Test-Retest Reliability of Muscle Thickness, Echo-Intensity and Cross Sectional Area of Quadriceps and Hamstrings Muscle Groups Using B-mode Ultrasound

Ultrasound muscle images have been extensively used as tools for investigating, diagnosing and monitoring thigh muscles. However, there is a lack of information examining ultrasound reliability of quadriceps and hamstrings images for research and clinical use. Objectives: To determine the reliability of muscle thickness (MT), echo intensity (EI) and cross sectional area (CSA) of quadriceps and hamstrings muscle groups. Methods: Single transverse images of the rectus femoris (RF), vastus intermedius (VI), vastus medialis (VM), vastus lateralis (VL), biceps femoris long head (BFlh), semitendinosus (ST), and semimembranosus (SM) muscles were scanned in the right and left legs of ten healthy collegiate men (age 23.4 ± 2.2 yrs, mass 71.7 ± 11.7 kg, height 1.73 ± 0.06 m) in two sessions separated by a one-day interval. Intraclass correlation coefficients (ICCs), standard error of measurement (SEM), and minimum difference to be considered “real” (MD) were calculated for MT, EI, and CSA. Results: For quadriceps muscles, MT, EI, and CSA showed ICCs of 0.97-0.99, 0.83-0.88, and 0.86-0.97; SEMs of 0.72-1.38, 2.73-3.41, and 0.36-1.04; and MDs of 2.01-3.82, 7.56-9.46, and 0.99-2.89, respectively. For hamstrings muscles, the corresponding values were ICCs of 0.93-0.99, 0.74-0.90, and 0.89-0.96; SEMs of 0.73-1.94, 3.29-4.98, and 0.69-1.08; and MDs of 2.03-5.38, 9.13-13.81, and 1.91-2.98. Conclusions: These results suggest that ultrasound imaging of both quadriceps and hamstrings muscle architecture is a reliable technique for assessing thigh musculoskeletal tissue. The anatomical sites, as well as ultrasound adjustments, images, and results utilized here may assist future researchers and clinicians as reference tools when measuring quadriceps and hamstrings musculature.

Reliability can be defined as relative consistency of variance and may be assessed in different ways (Weir, 2005). A variety of studies have used the ICC to determine test-retest reliability (Cadore et al., 2014; Cadore et al., 2012; Caresio et al., 2015; Jenkins et al., 2015; Melvin, Smith-Ryan, Wingfield, Fultz, & Roelofs, 2014; Palmer et al., 2015; Pinto et al., 2014; Radaelli et al., 2013a; Radaelli et al., 2013b; Rech et al., 2014). The intraclass correlation coefficient (ICC) has no unit of measurement, and is estimated from the proportion of variance in a mean or set of scores (true score variance) and variance of error between scores (Weir, 2005). These values vary between 0 and 1.0, where 0 indicates low reliability and high error, and 1.0 indicates high reliability and low error (Munro, 2005; Weir, 2005). MT is defined as the distance between the adipose-muscle upper interface and the lower interface of muscle or bone (Cadore et al., 2014; Pinto et al., 2014; Rech et al., 2014). Validity of MT has been previously described for the anterior and posterior thigh muscles (Abe, Loenneke, & Thiebaud, 2016; Noorkoiv et al., 2010; Palmer et al., 2015). Radaelli et al. (2013b) reported test-retest ICCs ranging from 0.85-0.95 for overall quadriceps MT. Abe et al. (2016) also found that ultrasound was a highly reliable method for measuring hamstrings MT. However, reliability of MT has not been described for specific muscles of the quadriceps and hamstrings.
CSA has been used as a measurement of neuromuscular adaptations in size due to growth, diet or resistance training (Croix, Deighan, & Armstrong, 2003; Watson, McPherson, & Starr, 2008). Assessing CSA from panoramic images is a recent advancement in ultrasound technology, but this technique requires significant practice and its reliability may be affected by the degree of curvature of the muscle (Jenkins et al., 2015). Although previous studies using panoramic images have reported CSA reliability of thigh muscles ranging from 0.89-0.98, single transverse images may be simpler and the image-taking process more time efficient (Jenkins et al., 2015; Palmer et al., 2015). Additionally, CSA can be measured with single transverse imaging that allows a greater depth capability, especially in smaller muscles (Ahtiainen et al., 2010; Palmer et al., 2015). Muscle EI is measured through greyscale quantification of muscle images (Caresio et al., 2015). It may be used to assess intramuscular fat and/or fibrous composition (Caresio et al., 2015; Jenkins et al., 2015; Palmer et al., 2015; Pillen et al., 2009; Scholten, Pillen, Verrips, & Zwarts, 2003), as well as to identify neuromuscular disorders (Arts, Pillen, Schelhaas, Overeem, & Zwarts, 2010; Pillen et al., 2009). A few studies have reported that EI is reliable for predicting muscle quality of thigh muscles, with ICCs ranging from 0.71-0.91 (Cadore et al., 2014; Palmer et al., 2015). However, Caresio et al. (2015) recently reported that the reliability of RF echo intensity may depend on the location, size and shape of the region of interest (ROI) measured in the muscle.
Although previous research has extensively used ultrasound as a tool for individually investigating thigh muscles, there is limited information on single transverse ultrasound imaging reliability for quadriceps and hamstrings muscle groups. Given the increasingly common use of ultrasound in lower-extremity muscles for identification of muscle architecture changes, the aim of this study was to determine the reliability of MT, EI and CSA, as well as to provide anatomical sites and ultrasound adjustments as reference tools. A secondary aim was to establish the reliability of specific muscles that are deep and/or less commonly assessed in ultrasound measurements, such as biceps femoris long head (BFlh), vastus intermedius (VI), semitendinosus (ST) and semimembranosus (SM). The rationale for choosing all quadriceps and hamstrings muscles is that these compose the greatest muscle groups of the thigh (Lutterbach-Penna et al., 2014), and therefore are involved in most lower limb daily and sport activities due to their active role in knee and hip extension-flexion movements (Archer et al., 2016; Brown, Whitehurst, Gilbert, & Buchalter, 1995; Finlay & Friedman, 2006; Lutterbach-Penna et al., 2014; Maulit et al., 2017; Ruas, Brown, & Pinto, 2015; Ruas, Minozzo, Pinto, Brown, & Pinto, 2015; Wilhelm et al., 2013).

Participants
Ten healthy collegiate men (age mean ± SD = 23.4 ± 2.2 yrs, mass 71.7 ± 11.7 kg, height 1.73 ± 0.06 m) volunteered to participate. Sample size calculation was performed with G*Power 3.1 (Institute for Experimental Psychology, Dusseldorf, Germany) based on sample sizes of previous ultrasound reliability studies of thigh muscles (Caresio et al., 2015; Palmer et al., 2015). Participants had no previous musculoskeletal injuries, and were not currently involved in any progressive resistance or endurance training program. Before participation, all participants provided written informed consent approved by the California State University Institutional Review Board Committee, Fullerton, CA, USA (approval number HSR-14-0454).

Procedures
Measurements were performed with participants in a supine position with both arms and legs extended and relaxed. They lay in this position for 10 minutes for stabilization of normal body fluids (Pinto et al., 2014; Rech et al., 2014). Three consecutive images were made in the transverse plane of the quadriceps RF, VI, VL and VM, and hamstrings BFlh, ST and SM muscles of both the right and left legs of each participant using real-time B-mode ultrasound on two consecutive days (Figure 1) (Palmer et al., 2015; Pinto et al., 2014). Images from previous studies testing reliability and validity of thigh muscles by ultrasound, computed tomography and MRI (Abe et al., 2016; Caresio et al., 2015; Noorkoiv et al., 2010; Palmer et al., 2015; Radaelli et al., 2013b; Worsley, Kitsell, Samuel, & Stokes, 2014) were used as references for identification of quadriceps and hamstrings muscles in our ultrasound images. A researcher experienced in ultrasound assessments performed all measurements. All images were clear with identifiable muscle borderlines, so that the ROI could be selected with as much of the muscle as possible, without including any surrounding fascia, as suggested by many studies (Caresio et al., 2015; Noorkoiv et al., 2010; Palmer et al., 2015; Rosenberg et al., 2014). All analyses presented regularity, not varying from image to image.
Quadriceps images were acquired first, followed by hamstrings images with participants in a prone position. These positions ensured that participants kept their legs extended and muscles relaxed during all measurements (Caresio et al., 2015). A water-based gel was used between the skin and transducer in order to ensure acoustic contact and reduce the risk of misinterpretation of images due to pressure of the transducer (Caresio et al., 2015; Pinto et al., 2014; Radaelli et al., 2013a; Rech et al., 2014). The anatomical site for all measurements was at 50% of the distance between the lateral condyle and greater trochanter of the femur, except for the VM, which was at 30% (Pinto et al., 2014; Rech et al., 2014). The transducer was placed perpendicular to the leg at the largest diameter of the muscles (Caresio et al., 2015). Transparency film was used to map the skin to ensure measurement sites matched between days (Radaelli et al., 2013a).

Ultrasound equipment
A portable B-mode ultrasound device (GE LOGIQ™ e, GE Healthcare, WI, USA) with a linear-array transducer (code 12L-RS, variable frequency band 4.2-13.0 MHz) was used for all measurements. Settings for gain (52 dB), depth (9 cm), and frequency (12 MHz) were maintained for all images. These were optimized for quality (Palmer et al., 2015). All images were saved and exported for analyses in ImageJ software (Version 1.48v, National Institutes of Health, Bethesda, MD, USA).

Muscle thickness measurements
MT values were measured as the widest distance between the adipose-muscle upper interface and the lower interface for all quadriceps and hamstrings muscles, except for VI MT, which was measured as the widest distance between the adipose-muscle upper interface and the bone (Palmer et al., 2015; Rech et al., 2014) (Figure 1a). Distances were measured using the straight-line function of the ImageJ software. The average of three MT measurements was calculated for each muscle.
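As an illustration of the averaging step described above, the following minimal sketch converts three straight-line measurements from pixels to millimetres and averages them. The pixel scale, the endpoint coordinates and the function name `line_length_mm` are hypothetical; the study itself used the ImageJ straight-line tool, and the source does not report the image calibration.

```python
import math

def line_length_mm(p1, p2, mm_per_pixel):
    """Euclidean distance between two (x, y) pixel points, in millimetres."""
    return math.hypot(p2[0] - p1[0], p2[1] - p1[1]) * mm_per_pixel

# Hypothetical calibration: 9 cm imaging depth spread over 450 pixel rows
MM_PER_PIXEL = 90.0 / 450  # 0.2 mm per pixel

# Hypothetical endpoints (upper interface, lower interface) on three images
endpoints = [
    ((200, 100), (200, 210)),
    ((201, 101), (201, 212)),
    ((199, 99), (199, 208)),
]

thicknesses = [line_length_mm(a, b, MM_PER_PIXEL) for a, b in endpoints]
mean_mt = sum(thicknesses) / len(thicknesses)  # value carried into the reliability analysis
print(f"MT per image (mm): {thicknesses}, mean: {mean_mt:.2f}")
```

Averaging the three consecutive images reduces the influence of any single probe placement on the score entered for that session.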

Muscle cross-sectional area measurements
Muscle CSA values of each muscle were measured by using the polygon function of the ImageJ software, surrounding the muscles without including fascia or bone (Jenkins et al., 2015; Palmer et al., 2015). The settings for depth and field of view allowed measurement of the entire muscle CSA for RF, ST and SM. For the other muscles, the ROI including as much of the muscle as possible was used (Caresio et al., 2015; Palmer et al., 2015) (Figure 1b). The average of three CSA measurements was calculated for each muscle.
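The area enclosed by a traced polygon can be sketched with the shoelace formula, shown below as a minimal illustration. This is an assumption about how a polygon area may be computed in general, not a description of ImageJ internals; the vertex coordinates, the pixel scale and the name `polygon_area` are hypothetical.

```python
def polygon_area(vertices):
    """Area of a simple polygon from ordered (x, y) vertices (shoelace formula)."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]  # wrap around to close the polygon
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

# Hypothetical calibration: 0.02 cm per pixel in both directions
CM_PER_PIXEL = 0.02

# Hypothetical ROI: a 200 x 100 pixel rectangle traced inside the muscle border
roi = [(100, 100), (300, 100), (300, 200), (100, 200)]

area_px = polygon_area(roi)             # area in square pixels
csa_cm2 = area_px * CM_PER_PIXEL ** 2   # area in square centimetres
print(f"CSA: {csa_cm2:.2f} cm^2")
```

Because the area scales with the square of the pixel size, keeping depth and field of view constant across sessions matters for comparing CSA between days.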

Muscle echo-intensity measurements
Muscle EI values of each muscle were measured by grayscale analyses using the histogram function of the ImageJ software with the same preselected polygon ROIs used for CSA measurements. EI measurements were expressed in values between 0 and 255, where black = 0 and white = 255 (Caresio et al., 2015; Radaelli et al., 2013a; Rech et al., 2014). The average of three EI measurements was calculated for each muscle.
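The idea of a mean gray value inside a polygon ROI can be sketched as follows. This is a simplified illustration on a synthetic image, assuming that a pixel belongs to the ROI when its centre lies inside the polygon; ImageJ may treat boundary pixels differently, and the function names and image values are hypothetical.

```python
def point_in_polygon(x, y, vertices):
    """Ray-casting test: True if the point (x, y) lies inside the polygon."""
    inside = False
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge crosses the horizontal line through y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def mean_echo_intensity(image, vertices):
    """Mean gray value (0-255) of pixels whose centres fall inside the ROI."""
    values = [
        image[row][col]
        for row in range(len(image))
        for col in range(len(image[0]))
        if point_in_polygon(col + 0.5, row + 0.5, vertices)
    ]
    return sum(values) / len(values)

# Synthetic 8-bit image: dark background (20) with a brighter 4 x 4 patch (120)
image = [[20] * 10 for _ in range(10)]
for r in range(3, 7):
    for c in range(3, 7):
        image[r][c] = 120

tight_roi = [(3, 3), (7, 3), (7, 7), (3, 7)]  # follows the bright patch
loose_roi = [(2, 2), (8, 2), (8, 8), (2, 8)]  # also includes a darker rim

print(mean_echo_intensity(image, tight_roi))
print(mean_echo_intensity(image, loose_roi))
```

The two ROIs give different mean values for the same image, which illustrates why the size, shape and position of the ROI can influence EI.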

Statistical Analyses
Intraclass correlation coefficients (ICCs) (1,1) were used to calculate test-retest reliability for MT, CSA and EI of each muscle. This ICC model was used since the results were from single scores from each participant for each trial (Weir, 2005). Values between 0.00-0.25 were considered as having no reliability, 0.26-0.49 low reliability, 0.50-0.69 moderate reliability, 0.70-0.89 high reliability, and 0.90-1.00 very high reliability (Munro, 2005). All data were expressed as means and SD, and analyses were performed with SPSS 20.0 (Statistical Package for Social Sciences, Chicago, IL, USA). An a-priori alpha level of 0.05 determined statistical significance.
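For readers who want to reproduce this type of analysis outside SPSS, ICC(1,1) can be sketched from the one-way random-effects mean squares as (MSB - MSW) / (MSB + (k - 1) MSW), where MSB and MSW are the between- and within-participant mean squares and k is the number of sessions. The code below is a minimal sketch under that standard definition; the function names and the example scores are hypothetical, and the cut-offs in `munro_label` assume ICCs rounded to two decimals.

```python
def icc_1_1(scores):
    """ICC(1,1) from a list of per-participant score lists (one value per session)."""
    n = len(scores)       # participants
    k = len(scores[0])    # sessions per participant
    grand_mean = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    ms_between = k * sum((m - grand_mean) ** 2 for m in row_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(scores, row_means) for x in row
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

def munro_label(icc):
    """Qualitative band following the thresholds attributed to Munro (2005)."""
    if icc >= 0.90:
        return "very high"
    if icc >= 0.70:
        return "high"
    if icc >= 0.50:
        return "moderate"
    if icc >= 0.26:
        return "low"
    return "none"

# Hypothetical day 1 / day 2 scores for three participants
example = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
icc = icc_1_1(example)
print(round(icc, 3), munro_label(icc))
```

In the one-way model, systematic differences between sessions are counted as error, so a consistent day-to-day shift lowers ICC(1,1).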

Results
The means and SD, ICCs, standard error of measurement (SEM, absolute reliability), and minimum difference to be considered "real" (MD) for MT, EI and CSA of the quadriceps RF, VI, VL and VM, and hamstrings BFlh, ST and SM muscles are presented in Tables 1 and 2. ICCs for MT, EI and CSA were in a range from 0.97-0.99, 0.83-0.88, and 0.86-0.97 for quadriceps muscles, and 0.93-0.99, 0.74-0.90, and 0.89-0.96 for hamstrings muscles (p<0.05).
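The text does not state how SEM and MD were computed. A common choice, assumed here, follows Weir (2005): SEM = SD × sqrt(1 - ICC) and MD = SEM × 1.96 × sqrt(2). The sketch below applies these assumed formulas; the function names and the SD value are hypothetical, while the two SEM inputs are the quadriceps MT bounds reported in the abstract.

```python
import math

def sem_from_icc(sd, icc):
    """Standard error of measurement from the score SD and the ICC (assumed formula)."""
    return sd * math.sqrt(1.0 - icc)

def minimum_difference(sem, z=1.96):
    """Minimum difference to be considered real, at 95% confidence (assumed formula)."""
    return sem * z * math.sqrt(2.0)

# Hypothetical example: SD of 5.0 units with an ICC of 0.96
sem_example = sem_from_icc(5.0, 0.96)

# SEM bounds reported for quadriceps MT (0.72 and 1.38)
md_low = minimum_difference(0.72)
md_high = minimum_difference(1.38)
print(f"SEM example: {sem_example:.2f}")
print(f"MD from reported SEM bounds: {md_low:.2f} to {md_high:.2f}")
```

Under these assumptions the reported SEM bounds give MD values of roughly 2.00 and 3.83, close to the reported 2.01-3.82; the small differences are what would be expected if the published SEMs were rounded before MD was derived.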

Discussion
The aim of this study was to determine the test-retest reliability of MT, EI, and CSA measurements in the quadriceps and hamstrings muscle groups, including specifically the BFlh, VI, ST, and SM. Our results demonstrated very high reliability for MT, and high to very high reliability for CSA and EI for all muscles. This suggests that both quadriceps and hamstrings ultrasound images can be used as reliable measurements for assessing muscle size, quality and area.
We found that ICCs for MT were in a range of 0.97-0.99 for quadriceps, which is considered very high test-retest reliability. This is in accordance with, or slightly greater than, previous studies that assessed quadriceps muscles individually in young men and women (Cadore et al., 2014), elderly men (Cadore et al., 2012), or summed in elderly women (Pinto et al., 2014; Radaelli et al., 2013b). However, the highest reliability we found was 0.99 for VI. This is greater than the ICCs of 0.90 and 0.92 found by Radaelli et al. (2013a) and Cadore et al. (2012), respectively, in elderly participants. The reason for this discrepancy may be related to our younger population. Arts et al. (2010) reported that age could contribute to differences in quadriceps MT after testing a cohort that included participants between 17 and 90 years. We are not aware of other reliability studies that have tested VI MT in young adults. Also, differences in ultrasound adjustments for depth and gain may have contributed to the small discrepancy between our results and those of previous studies.
To the best of our knowledge, few studies have investigated test-retest reliability of hamstring MT using perpendicular scans (Palmer et al., 2015; Thoirs & English, 2009). Thoirs and English (2009) found a reliability range of 0.79-0.84 in the hamstrings, but did not measure hamstrings muscles separately. Palmer et al. (2015) reported reliability between 0.89-0.97 for hamstrings MT in men, which is slightly less than our 0.93-0.99 range. We found that the greatest reliability in hamstrings was for BF, followed by ST and SM. The hamstrings have been reported as a difficult site to visualize (Thoirs & English, 2009), and the BF as having a greater muscle size and less muscle fat compared to ST and SM (Palmer et al., 2015). This is in agreement with our analyses, and may explain the greater reliability and clarity we found for BF compared to ST and SM, which were smaller and appeared blurrier.
Increased muscle EI may represent intramuscular fat and fibrous tissue (Arts et al., 2010; Caresio et al., 2015; Young, Jenkins, Zhao, & McCully, 2015). However, Caresio et al. (2015), when comparing different upper and lower limb muscles, found that EI reliability may be increased depending on the size, shape and position of the ROI measured. Palmer et al. (2015), when investigating young men and women aged 21-24 years, also suggested that hamstrings BF, which had the largest ROI, had increased EI reliability compared to the small ROIs of ST and SM. This is in agreement with our study, since VM and BF, which had the largest ROIs, also presented the greatest EI reliability values. Although our EI results match these previous studies, the dependence of EI reliability on ROI size may affect the consistency of EI measurements, and calls into question whether the greyscale values precisely represent valid markers for intramuscular fat and fibrous tissue (Caresio et al., 2015). EI results are often underestimated due to variability caused by participants with increased intramuscular fat, which may affect sound absorption and reflection of echo signals (Young et al., 2015). Furthermore, our results also show that EI reliability values were generally less than CSA and MT for almost all muscles, except VI, where CSA was greater than EI.
We found high to very high reliability for quadriceps and hamstrings CSA, ranging from 0.86-0.97. These results are in agreement with previous studies using panoramic images of the thigh muscles (Noorkoiv et al., 2010; Palmer et al., 2015). However, our study used single transverse images, which measure the entire muscle CSA. Nevertheless, Jenkins et al. (2015) reported that there was a high correlation between single transverse and panoramic MT, EI and CSA of the biceps brachii muscle, allowing for the use of MT to quantify CSA in muscles with large total ROIs. We are not aware of studies that have done this type of association for quadriceps and hamstrings. Although panoramic imaging has recently been used to quantify muscle CSA, it requires proper positioning and angling of the probe, since large muscle curvature may lead to low reliability for this procedure (Jenkins et al., 2015; Noorkoiv et al., 2010; Palmer et al., 2015; Rosenberg et al., 2014). Single transverse images may be a simpler, more accurate and more time-efficient technique for quantifying muscle size (Jenkins et al., 2015). Future studies are needed to determine reliability between single transverse and panoramic CSA of thigh muscles for comparison.
This study may have a couple of limitations. First, since we tested both legs in 10 participants, this may have led to an increased relationship in indices of muscle morphology between right and left legs. The best solution to avoid this risk could be to assess legs individually. However, bilateral differences in muscle morphology between legs have previously been reported (Mangine et al., 2014; McCreesh & Egan, 2011), and legs have previously been matched for ultrasound measurements of the thigh (Giles, Webster, McClelland, & Cook, 2015). Therefore, to avoid any risk we also assessed right and left legs separately. Our results showed similarly high to very high ICC results for MT, EI, and CSA. While the right leg was in a range of 0.92-0.99 (MT), 0.74-0.93 (EI), 0.87-0.97 (CSA) for quadriceps and 0.81-0.99 (MT), 0.82-0.93 (EI), 0.87-0.97 (CSA) for hamstrings, the left leg was in a range of 0.97-0.99 (MT), 0.78-0.85 (EI), 0.77-0.97 (CSA) for quadriceps and 0.97-0.99 (MT), 0.71-0.94 (EI), 0.96-0.99 (CSA) for hamstrings (p<0.05). Second, since VI was the deepest muscle measured, echo beam attenuation may have occurred and affected the EI results for this muscle (Caresio et al., 2015; Young et al., 2015). Young et al. (2015) suggested the use of calibration equations to diminish the independent influence of subcutaneous fat thickness in deep muscles, although specific ultrasound settings have to be used, and the reliability, validity and applicability of these equations still need to be tested by future studies. However, we are not aware of any other study that has measured EI of the VI muscle for comparison to our results, and specific calibration equations may have to be developed for this and other deep muscles, due to differences in muscle morphology and composition (Young et al., 2015).

Conclusion
To our knowledge, this is the first study to determine reliability of all quadriceps and hamstrings muscles. Our results suggest that both quadriceps and hamstrings ultrasound images have very high reliability for MT, and high to very high reliability for CSA and EI for all muscles. Therefore, both quadriceps and hamstrings ultrasound images can be used as reliable measurements for thigh musculoskeletal tissue assessments in size, quality and area.

Figure 1. Examples of ultrasound images for quadriceps vastus lateralis (VL), rectus femoris (RF), vastus intermedius (VI) and vastus medialis (VM), and hamstrings biceps femoris long head (BFlh), semitendinosus (ST) and semimembranosus (SM) muscles. (a) Examples of the widest distances between the adipose-muscle upper interface and the lower interface of muscle or bone for muscle thickness (MT) measurements of each muscle. (b) Examples of the regions of interest (ROIs) including as much of the muscle as possible, without fascia or bone tissue, for cross sectional area (CSA) and echo-intensity (EI) measurements of each muscle.