Abstract
Importance: Occupational therapy practitioners need modern tools for the assessment of maximal grip strength in clinical and remote settings.
Objective: To establish the (1) interrater reliability and (2) precision of the GripAble among three raters with different expertise in occupational therapy when testing healthy participants, and to (3) evaluate the relative reliabilities of different approaches to estimating grip strength (i.e., one trial, mean of two trials, and the mean of three trials).
Design: Measurement study.
Setting: Minnesota Translational Musculoskeletal and Occupational Performance Research Lab, University of Minnesota, Minneapolis.
Participants: Thirty volunteers, age ≥18 yr, without any hand problems.
Outcomes and Measures: Using GripAble, three occupational therapy raters with varied experience measured the maximal grip strength of the dominant and nondominant hands of all participants.
Results: GripAble has excellent interrater reliability (i.e., intraclass correlation coefficient > .75) and acceptable precision (minimal detectable change < 15%) among healthy adults. Using the mean of three trials when testing grip strength with GripAble adds precision.
Conclusions and Relevance: GripAble allows occupational therapy practitioners with different experiences to assess grip strength in healthy hands quickly, precisely, and with excellent reliability. Additional research is needed on its psychometrics in clinical populations and capacities in remote monitoring and exergaming.
Plain-Language Summary: The results of this study show that grip strength, an important biomarker and commonly assessed construct in occupational therapy, can be evaluated reliably, precisely, and rapidly with GripAble. The use of GripAble by occupational therapy practitioners in clinical settings may help to build an infrastructure for remote measurements and exergaming interventions in the future.
Hand grip strength (HGS) is an indicator of general muscle strength (Stevens et al., 2012) and activity level across the lifespan (Pérez-Parra et al., 2024). It is associated with overall health status, including cognitive function (Jiang et al., 2022), morbidity and mortality (Roberts et al., 2011), future disability (Rantanen et al., 1999), and declining health-related quality of life (Sayer et al., 2006). Thus, public health experts have suggested that HGS should be a routine biomarker in older adults (Ibrahim et al., 2018).
In addition to its role as a biomarker, HGS is commonly assessed in occupational therapy practice settings after the onset of a disease or injury that results in upper limb weakness (Che Daud et al., 2016). Measurements can be used to determine the client’s HGS impairment according to existing normative data (Mathiowetz et al., 1985), monitor the change seen in the client during the occupational therapy process, or establish a criterion for return to work after upper extremity injury (Berryhill, 1990).
Different clinical instruments are available for measuring HGS. HGS can be quantitatively evaluated with hydraulic, pneumatic, mechanical, or electrical handgrip dynamometers (Blomkvist et al., 2016). The Jamar hydraulic hand dynamometer (Performance Health, n.d.) has high reliability and validity when properly calibrated (Innes, 1999). It is the gold standard method (Hogrel, 2015) that has been used in many studies to evaluate handgrip strength (Roberts et al., 2011), has normative values for different nations and cultures (Ekşioğlu, 2016; Lee & Hwang, 2019; Mitsionis et al., 2009; Werle et al., 2009), and has been recommended for use by the American Society of Hand Therapists (ASHT; MacDermid et al., 2015).
Despite its well-established measurement properties (Mathiowetz et al., 1984), the Jamar has several disadvantages, such as a laborious calibration process that must be repeated regularly, a resolution of only 2 kg (Roberts et al., 2011), and a limited capacity to measure values in its lower ranges (Mace et al., 2022). Thus, the Jamar may not be appropriate for all clinical populations (Tyler et al., 2005), particularly those with profound weakness (Massy-Westropp et al., 2004), and may be inadequate for detecting small changes in strength (Richards et al., 1996).
Lastly, analog technologies, such as those used by the Jamar, do not allow for the coupling of HGS monitoring with gaming applications. This limitation is considerable given the recent uptake of gamified rehabilitation (Stamate et al., 2023) and that marrying rehabilitation gaming applications designed for restoring physical performance with monitoring technologies has been proven to have positive effects on clients’ motivation to engage in rehabilitation (Taylor et al., 2018).
GripAble (Figure 1) is a multipurpose device with the capacity to monitor and train distal upper extremity motor function through gaming applications. The HGS monitoring component of the device has high resolution (0.1 kg), excellent accuracy, and high concurrent validity with the gold standard, Jamar (Mace et al., 2022; Mutalib et al., 2022). In addition to the load cells used to quantify HGS, GripAble contains inertial motion sensors that can monitor movements of the wrist and forearm in all directions. This same technology also supports the retraining of grip as well as forearm and wrist motions through the use of several rehabilitation exergame applications (Benzing & Schmidt, 2018). Although no universal definition for exergaming exists, Gao et al. (2016) suggested that exergaming is most accurately described as video games that require bodily movement to play and that function as a form of physical activity.
Although the GripAble device affords new opportunities for measuring and training distal upper extremity motor function and improving client engagement (Karamians et al., 2020), its measurement properties are largely understudied. In particular, because rehabilitation decisions are often based on how far a client’s scores fall outside of typical or normal values (Werle et al., 2009), GripAble HGS reference values are needed but are currently unestablished. Additionally, for such a normative data set to be established, numerous evaluators are usually needed. Given these factors, and that occupational therapy clients may need to have their hand strength progress assessed by different therapists, it would be reasonable to question whether scores might vary among differing raters and, similarly, how rater experience might factor into such agreement (McHugh, 2012). Although GripAble is accurate (Mace et al., 2022) and has high interinstrument reliability with Jamar+ (Mutalib et al., 2022), no studies currently exist on the agreement among raters (i.e., interrater reliability).
Because it is recommended to first test instrument psychometrics in healthy people (Werle et al., 2009), and because it is a classical practice to compare the strength of the affected with the contralateral unaffected hand (Bulut et al., 2018), GripAble’s interrater reliability should first be established in those without known upper limb performance limitations (McGee et al., 2023). Establishing the reliability and precision (i.e., measurement error) of the GripAble will inform our confidence in our clinical assessment findings and set the stage for future research on its reference values and the efficacy of its gaming applications.
Finally, although the standardized procedures for HGS assessment rarely vary, how HGS is estimated and reported does (i.e., one trial vs. the average of two or three) and so does the reliability of these different approaches (Bai et al., 2019; Coldham et al., 2006). For this reason, it is important to know how the reliability of a new tool varies according to the number of trials performed (i.e., one, two, or three).
The primary aim of this study was to establish the interrater reliability and precision of the GripAble among three raters with different expertise in occupational therapy when testing people without any hand or upper extremity problems. A secondary aim was to evaluate the relative reliabilities of different approaches to estimating maximal grip strength (i.e., one trial, mean of two trials, and the mean of three trials).
Method
Design
A cross-sectional study design was used to assess the interrater reliability and precision of the GripAble device. The study was designed and carried out per the guidelines for measurement studies put forth by the COnsensus-based Standards for the selection of health Measurement Instruments (COSMIN; Mokkink et al., 2010). The study was approved by the University of Minnesota’s Institutional Review Board (STUDY00019264).
Participants
A total of 30 adults (60 hands) were included in this study. Purposeful sampling methods were used to recruit male and female adults age 18 yr or older from a large midwestern state fair who represented the entirety of the adult lifespan, both sexes (assigned at birth), and the racial and ethnic diversity of the metropolitan area. Participants who reported having any central or peripheral nervous system disorder; a history of surgery, fracture, or rheumatological disease affecting the upper extremity; or cardiovascular and cardiac conditions that precluded them from strength testing were not included. Additionally, participants who experienced increased pain in the tested hand and fingers during testing or who could not follow the standardized procedures were excluded. Informed consent was obtained from all participants before testing. Hand dominance was determined following the methods used for the motor domain tests of the NIH Toolbox (Reuben et al., 2013).
Testing Personnel
Three raters collected grip strength data on each participant. The raters had different backgrounds in academic training and clinical experience. Rater 1 was an occupational therapist who had an MS and a PhD in occupational therapy, with approximately 10 yr of experience (7 yr experience in hand therapy). Rater 2 was an occupational therapy doctorate student in his final year. Rater 3 was an occupational therapist who had an MS in occupational therapy and a PhD in rehabilitation science, with approximately 23 yr of experience (17 yr experience as a certified hand therapist [CHT]). The reason for including raters with varied clinical experience was to investigate the reliability and precision of the tool when administered by clinicians with different levels of clinical experience.
Data Collection
Demographic data were gathered at the recruitment venue some 2 wk before GripAble data collection. Testing was performed in a research laboratory setting at the University of Minnesota. Three GripAble devices were used across the study period; however, all raters used the same device when testing a given participant. Before the testing session, the device used for testing was calibrated by the first rater following the manufacturer’s instructions. As per the methods used in the motor domain of the NIH Toolbox (Reuben et al., 2013), each rater performed three trials of grip strength testing for each hand, always beginning with the dominant hand and alternating hands after each trial. Thus, each participant was tested a total of 18 times by the three raters (9 times on each side). To control for order effects, the order in which the raters tested each participant was randomized using a random number generator (https://www.random.org).
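The testing sequence described above can be illustrated with a minimal sketch. It is not the procedure used in the study, which drew rater order from https://www.random.org; here Python's built-in `random` module is substituted, and the function names, the seed, and the "right"/"left" labels are hypothetical.

```python
import random

RATERS = ("Rater 1", "Rater 2", "Rater 3")


def rater_order(rng):
    """Return the three raters in a random testing order for one participant."""
    order = list(RATERS)
    rng.shuffle(order)
    return order


def trial_schedule(order, dominant="right", trials_per_hand=3):
    """List every (rater, hand, trial) measurement for one participant.

    Each rater starts with the dominant hand and alternates hands after
    every trial, as described in the data collection procedures.
    """
    nondominant = "left" if dominant == "right" else "right"
    schedule = []
    for rater in order:
        for trial in range(1, trials_per_hand + 1):
            schedule.append((rater, dominant, trial))
            schedule.append((rater, nondominant, trial))
    return schedule


rng = random.Random(2023)  # hypothetical seed, used only to make the sketch repeatable
order = rater_order(rng)
schedule = trial_schedule(order)
print(order)
print(len(schedule), "measurements per participant")
```

With three raters and three trials per hand, the schedule contains the 18 measurements (9 per side) reported for each participant.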
Each session took approximately 35 min in which participants rested for 15 s between trials and 10 min between each rater. The rest period durations are consistent with those recommended by ASHT, which recommends a 15-s rest between trials of the same hand and a minimum of a 3-min rest between sets (MacDermid et al., 2015). During testing, HGS values were displayed in “kg” on the screen of a digital tablet that was shielded from the participant’s view (see Figure 1). Data were then recorded on a separate electronic tablet using the Qualtrics Survey Tool (https://www.qualtrics.com), in which steps were taken to close out the GripAble application and advance the survey page so that each tester was blinded to the previous findings.
Testers followed the standardized positions recommended by ASHT (Figure 2) to evaluate HGS (MacDermid et al., 2015) and the standardized instructions used in the NIH Toolbox (Reuben et al., 2013). To avoid the risk of bias and to ensure methodological quality, we followed COSMIN in this study (Mokkink et al., 2010).
Data Analysis
IBM SPSS Statistics (Version 27) was used to complete statistical analyses. Descriptive statistics were used to describe the sociodemographic characteristics of the sample (age, gender, dominant hand, ethnicity, height, and weight), to calculate the mean of two and three grip strength trials for each hand, and to describe the overall mean and variance of grip strength measurements obtained by each rater. An α value of .05 was considered significant for all statistical analyses.
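As a small illustration of the three reporting approaches compared in this study (first trial only, mean of the first two trials, and mean of all three trials), the sketch below computes each from one rater's three trials for one hand. The analyses in the study were run in SPSS; the function name and the trial values here are hypothetical.

```python
from statistics import mean


def reporting_scores(trials):
    """Return the three reporting approaches for one rater and one hand.

    trials: the three maximal grip strength readings (kg), in testing order.
    """
    if len(trials) < 3:
        raise ValueError("three trials are required")
    return {
        "one_trial": trials[0],            # first trial only
        "mean_of_two": mean(trials[:2]),   # mean of first and second trials
        "mean_of_three": mean(trials[:3]), # mean of all three trials
    }


# Hypothetical readings (kg) for a single hand.
scores = reporting_scores([26.4, 25.8, 27.1])
for label, value in scores.items():
    print(f"{label}: {value:.1f} kg")
```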
Reliability
The intraclass correlation coefficient (ICC) was used to assess the interrater reliability when reporting the result of one trial (the first grip strength measurement [T1]), the average of the first and second grip strength measurements (T2), and the average of three grip strength measurements (T3) for both dominant and nondominant hands. For this study, ICC(2,3) (Hallgren, 2012) was the preferred test statistic because it is most appropriate for interrater reliability studies in which error may originate from the rater or the participant, multiple raters are involved, and raters are representative of the rater population (Koo & Li, 2016). A coefficient of “1” indicates perfect reliability, a coefficient of “0” indicates no reliability, and an ICC of ≥.75 indicates excellent reliability (Fleiss, 1986).
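For readers who wish to see how a two-way random-effects ICC is derived from a participant-by-rater table, the following sketch computes the absolute-agreement single-rater and average-of-raters forms from the standard analysis-of-variance mean squares. It is an illustration only: the study's coefficients were produced in SPSS, the exact model options used there are not stated in this section, and the function name and example scores are hypothetical.

```python
def icc_two_way_random(scores):
    """Two-way random-effects, absolute-agreement ICCs.

    scores: list of rows, one per participant, each holding one score per rater.
    Returns (single-rater ICC, average-of-k-raters ICC).
    """
    n = len(scores)      # number of participants
    k = len(scores[0])   # number of raters
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Partition the total sum of squares into participant, rater, and error parts.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in scores for v in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    single = (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )
    average = (ms_rows - ms_error) / (ms_rows + (ms_cols - ms_error) / n)
    return single, average


# Hypothetical three-trial mean scores (kg): five participants, three raters.
example = [
    [18.2, 18.9, 18.4],
    [25.1, 24.6, 25.5],
    [31.0, 31.8, 30.7],
    [40.3, 39.6, 40.9],
    [22.4, 22.0, 22.9],
]
single, average = icc_two_way_random(example)
print(f"single-rater ICC = {single:.3f}, average-of-raters ICC = {average:.3f}")
```

The average-of-raters form is never lower than the single-rater form when reliability is positive, so it matters which form a given coefficient represents when comparing values across studies.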
Precision
When interpreting the acceptability of a measure’s precision, some authors have proposed that a standard error of measurement percentage (SEM%) of less than 10% is clinically acceptable (Buckinx et al., 2017), whereas others have suggested that SEM% and minimal detectable change percentage (MDC%) values less than 15% and 30%, respectively, indicate acceptable precision (Smidt et al., 2002).
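The precision indices named above are conventionally derived from the ICC and the sample standard deviation. The sketch below uses the commonly cited formulas (SEM = SD × √(1 − ICC); MDC = SEM × z × √2; each expressed as a percentage of the mean for SEM% and MDC%). These formulas are an assumption about the usual approach rather than a statement of the exact computations performed in this study, and the function name and input values are hypothetical.

```python
import math


def precision_indices(sd, icc, mean_score, z):
    """Conventional SEM and MDC, in kg and as percentages of the mean.

    sd: standard deviation of the scores (kg)
    icc: reliability coefficient
    mean_score: mean of the scores (kg)
    z: confidence multiplier (about 1.645 for 90%, 1.96 for 95%)
    """
    sem = sd * math.sqrt(1.0 - icc)   # standard error of measurement
    mdc = sem * z * math.sqrt(2.0)    # minimal detectable change
    return {
        "SEM": sem,
        "MDC": mdc,
        "SEM%": 100.0 * sem / mean_score,
        "MDC%": 100.0 * mdc / mean_score,
    }


# Hypothetical inputs, chosen only to show the arithmetic.
result = precision_indices(sd=7.0, icc=0.96, mean_score=30.0, z=1.645)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

Because the confidence multiplier scales the MDC directly, the confidence level should always be reported alongside the value.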
Sample Size
On the basis of the findings of Mathiowetz et al. (1984), who described their lowest test–retest reliability ratings to be .99 using a Jamar grip dynamometer, a minimum sample size of 7 participants was required to achieve a conservative ICC estimate of .90 (p = .05; power = 80%).
Results
Data from 30 adults and 60 hands were collected. The ratio of male to female participants was 1:1, and participants were predominantly White (56.7%) and entirely non-Hispanic. The racial demographics were largely congruent with those of the metropolitan area where the testing occurred; however, those with a Hispanic ethnicity were underrepresented (i.e., 0% vs. 8.7%; U.S. Census Bureau, 2022). The percentage of participants who were right dominant was 90%, which accurately reflects the population (Llaurens et al., 2009). The average age of the sample was 48.3 yr (SD = 17.9). Additional characteristics of the participants are included in Table 1.
Reliability and Precision
The reliability and precision findings are presented in Table 2. All ICC values were >.75, indicating excellent reliability (Fleiss, 1986). The average grip strength measured by all raters was 25.55 kg (SD = 7.48) for the dominant hand and 23.87 kg (SD = 7.31) for the nondominant hand. The ICC values for the dominant side T1, T2, and T3 were .95, .95, and .98, respectively, and the ICC values for the nondominant side T1, T2, and T3 were .95, .95, and .99, respectively. Although T3 trended toward having higher ICC values in both hands, each measure’s 95% confidence interval (CI) contained the ICC values of all of the other measures, so none was statistically superior or inferior (McGraw & Wong, 1996).
The SEM, MDC, SEM%, and MDC% were calculated to determine precision. Across hands and reporting methods, the SEM ranged from 0.8 kg to 1.7 kg, and the MDC ranged from 2.3 kg to 4.9 kg. For T1, the SEM values were 1.7 kg and 1.6 kg, and the MDC values were 4.9 kg and 4.6 kg, for dominant and nondominant hands, respectively. For T2, the SEM values were 1.6 kg and 1.5 kg, and the MDC values were 4.6 kg and 4.3 kg, for dominant and nondominant hands, respectively. Finally, for T3, the SEM values were 0.9 kg and 0.8 kg, and the MDC values were 2.6 kg and 2.3 kg, for dominant and nondominant hands, respectively.
Lastly, across hands and reporting methods, SEM% values ranged from 2.7 to 5.3, and MDC% values ranged from 7.6 to 14.8. The SEM% values were 4.8 and 5.3, and the MDC% values were 13.5 and 14.8, for dominant and nondominant T1, respectively. The SEM% values were 4.8 and 5.2, and the MDC% values were 13.5 and 14.4, for dominant and nondominant T2, respectively. Finally, the SEM% values were 2.8 and 2.7, and the MDC% values were 7.8 and 7.6, for dominant and nondominant T3, respectively. According to the aforementioned cutpoints for acceptable measurement error, all SEM% and MDC% values fell into the acceptable range. However, it should be noted that the MDC% values of T3 were only 57% and 51% as large as those of T1 in the dominant and nondominant hands, respectively.
Discussion
HGS measurement is a commonly studied topic that is of clinical significance to numerous populations seen by occupational therapy practitioners and other health professionals (Blomkvist et al., 2016; Huang et al., 2022; Mace et al., 2022; Mathiowetz et al., 1984; Mutalib et al., 2022). GripAble is a novel digital device that can be used by occupational therapy practitioners to measure and treat distal upper extremity motor function (Mace et al., 2022; Mutalib et al., 2022); however, the psychometrics of its HGS assessment capacities were not previously studied.
In this study, we investigated the interrater reliability and precision of GripAble when used by three raters with varied clinical experience who assessed 30 healthy participants without hand and upper extremity problems. Healthy adults were chosen because reliability testing is often first performed in nonclinical populations and because HGS of nonaffected extremities is often used as a comparator in practice (Wang et al., 2018). Although interinstrument reliability with the Jamar has been established (Mutalib et al., 2022), we are, to the best of our knowledge, the first to examine the interrater reliability and precision of GripAble.
According to the recommendations on interpretation made by Fleiss (1986) and Smidt et al. (2002), our findings revealed that GripAble has excellent interrater reliability and precision when used to assess HGS in those without upper limb limitations. Therefore, we offer early evidence to support that among occupational therapy practitioners with varying levels of experience, GripAble and the testing procedures we describe can be reliably used to assess unaffected HGS. Moreover, when standard testing procedures are used (i.e., three trials of maximal HGS) to assess nonaffected HGS, our findings suggest that an occupational therapy practitioner can be 90% confident (i.e., MDC90) that their assessment findings will be within ±2.6 kg and ±2.3 kg of their colleagues when testing healthy dominant and nondominant hands, respectively.
In the context of other literature, GripAble has highly comparable interrater reliability with that of the gold standard Jamar when assessing healthy hands. When using the GripAble and the same three-trial method used with Jamar, we report ICC values of .98 to .99, whereas Mathiowetz et al. (1984) and Peolsson et al. (2001) reported ICCs of .97 to .99 when using the gold standard Jamar (Hogrel, 2015). At present, our review of the literature suggests that only Schreuders et al. (2003) reported on the precision of a handgrip dynamometer within the context of an interrater reliability study in healthy-handed adults. When using the Lode dynamometer and the same standardized procedures that we followed (MacDermid et al., 2015), they reported an SEM of 3.36 kg and an MDC of 9.38 kg. This measurement error is 3.5 to 4.0 times higher than that which we report (i.e., SEM of 0.8–0.9 kg and MDC of 2.3–2.6 kg) when using GripAble.
In regard to our second objective (i.e., exploring the influence of testing practices on reliability), our findings support that all three methods have similar reliability; however, the addition of the third trial appears to add additional precision. The literature on this topic is mixed. For example, among those with distal upper extremity trauma, some researchers have found one trial (Kennedy et al., 2010) to be as reliable as the average of three, whereas others have reported the average of three trials to have superior reliability (MacDermid et al., 1994). Similarly, Mathiowetz et al. (1984) suggested that the reliability of three trials is superior to two trials or one trial, and Maher et al. (2018) suggested that one or three trials have comparable reliability in healthy adults. Given the added precision of three trials, our recommendations on GripAble are more congruent with those of Mathiowetz et al. (1984). Given that these studies focused solely on the reliability of analog dynamometers, it is our understanding that we are the first to make such comparisons while using a digital dynamometer.
Our findings, together with GripAble’s high concurrent validity with Jamar (Mutalib et al., 2022) and its uncomplicated and practical calibration process (Mutalib et al., 2022), show that evidence is building to support the use of GripAble in HGS assessment.
Although a few digital dynamometers are currently described in the literature, the GripAble and Squegg (Bairapareddy et al., 2023) are the only ones that offer measurement and exergaming functions. Each has similar concurrent validity with the gold standard Jamar (Bairapareddy et al., 2023; Mutalib et al., 2022); however, unlike GripAble, Squegg’s interrater reliability is presently unknown. Moreover, the GripAble has hardware that can both assess and train hand kinetics as well as wrist and forearm kinematics (i.e., range of motion), whereas the Squegg only contains hardware to assess and train hand kinetics (i.e., specifically grip force).
Limitations
This study has some limitations. First, the sample size was limited; it would have been preferable to conduct this study with a larger sample. Second, although grip strength is frequently measured in both healthy participants and participants with health conditions, the sample of this study consisted only of healthy participants. Therefore, our findings cannot be generalized to any population with pathology affecting the hand and upper extremities. Finally, although it was not among the aims of the study, it should be noted that GripAble cannot provide data comparable with Jamar’s normative data because its dimensions differ from those of the Jamar (Magni et al., 2023).
Directions for Future Research
The results of our study indicate that further study on the GripAble is justified. Given that it is not always possible to evaluate the HGS of the contralateral hand to make comparisons, normative data can help to inform clinical decision making. However, because HGS norms have been gathered using a digital Jamar (Bohannon et al., 2019), and because GripAble’s HGS readings are described as being 69% (95% CI [68%–71%]) of those of Jamar (Mutalib et al., 2022), new norms are needed for GripAble.
Additionally, because our results were obtained from a healthy sample, they cannot be generalized to people with hand conditions. Therefore, we recommend that this study be replicated in clinical populations affected by specific upper extremity conditions. Similarly, no published studies can be found on GripAble’s other measurement properties (e.g., test–retest reliability, responsiveness, minimal clinically important difference), and additional research is needed to fill this void.
Finally, although our methods involved the use of standard clinical practices that are performed under the direct supervision of a clinician, the GripAble also has the capacity to guide clients through the process of self-administering HGS testing. Our reliability and precision findings cannot be generalized to this unsupervised approach; however, given the emphasis on and expansion of telerehabilitation services (Havran & Bidelspach, 2021), further study on the psychometrics of remote and self-administered HGS testing with the GripAble is required. Whether in a clinical or telehealth context, emerging evidence suggests that exergaming is accessible, engaging, and effective (Karamians et al., 2020). Although we offer findings that support GripAble’s use as a HGS assessment tool, and although its capability to offer interventions for distal upper extremity motor function is encouraging, the effects of its exergaming feature in both clinical and telehealth or remote settings are untested and require additional study.
Implications for Occupational Therapy Practice
The results of this study have the following implications for occupational therapy practice:
▪ GripAble is a novel upper extremity motor function exergaming and grip strength monitoring device.
▪ When using GripAble to evaluate the HGS of a nonaffected upper extremity, occupational therapy raters with differing levels of clinical experience can have highly reliable findings and can most successfully minimize measurement error when administering three trials.
▪ Future study is needed to determine GripAble’s HGS measurement properties among populations with upper extremity conditions (e.g., stroke, peripheral nerve injuries, fracture, lateral epicondylosis, osteoarthritis) when being administered by both clinicians and clients alike. Moreover, the effectiveness of its exergaming interventions also deserves additional exploration.
Conclusion
In this study, we were successful in meeting our aims of establishing (1) the interrater reliability and precision of the GripAble and (2) the most reliable approach to estimating HGS when using GripAble in those without hand impairments. Our study provides early evidence to support its use in measuring HGS in occupational therapy practice because of its excellent reliability and precision in this population. Although some variability can be found in the literature on how to most reliably estimate and report HGS with dynamometry, our findings suggest that, although all three approaches (estimating on the basis of one, two, or three trials) yield excellent reliability, taking the average of three trials yields the highest precision. For this reason, we recommend administering three trials and reporting their average.
Future research can build on our preliminary findings. Should GripAble prove to have robust measurement properties among diverse populations who are assessed in both clinical and remote settings, occupational therapy practitioners will be more equipped to monitor clients where they are most accessible. Similarly, should GripAble’s exergaming applications be effective in both settings, occupational therapy practitioners will be better able to intervene with clients where they are most accessible. With the addition of technologies such as GripAble, clients may have improved access to occupational therapy, more options for mass practice and higher dosages of motor training, improved capacities to self-monitor progress, improved adherence, and fewer barriers to occupational performance and participation.
Acknowledgments
The results of this study were previously presented at the Minnesota Occupational Therapy Association Conference, September 2023. This research was supported, in part, by National Center for Advancing Translational Sciences Grant KL2TR002492 and the TÜBİTAK-BIDEB 2219 scholarship program. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Center for Advancing Translational Sciences or the TÜBİTAK-BIDEB.