Using K-means Clustering to Create Training Groups for Elite American Football Student-athletes Based on Game Demands

Background: American football and the athletes who participate in it have continually evolved since the sport’s inception. The fluidity of the sport, as well as the growth of the body of knowledge pertaining to American football, requires evolving training techniques. While elite-level sports organizations are gathering performance data at very high rates, the value of that data can be limited by the small number of known uses for it. Objective: This study introduces a technique that can be used in tandem with data collected from wearable technology to better inform training decisions. Method: The K-means clustering technique was used to group athletes from two seasons' worth of data from an NCAA Division 1 American football team that is in the “Power 5.” The data was obtained using Catapult Sports OPTIMEYE S5 TM in games played only against other “Power 5” programs. This data was then used to calculate the average game demands of each student-athlete, which in turn were used to create training groups based upon individual game demands. Results: The resultant groupings from the single-season analyses of seasons one and two were similar to the traditional groupings used for training in American football, which served as validation of the results, while also offering insights on individuals who may need to consider training in a non-traditional group based upon their game demands. Conclusion: This technique can be brought to athletic training and be useful in any organization that is dealing with training multitudes of athletes.


Problem Identification: The Original Three Groupings and Evidence of Change in American Football
In 1997, Pincivero and Bompa recognized that, "A basic understanding of the physiological systems utilized in the sport of football is necessary in order to develop optimal training programmes geared specifically for preparation as well as the requirements of individual field positions." They recognized that position-specific demands would aid in optimal training when they identified three player categories: linemen, backs and receivers, and linebackers. Pincivero and Bompa laid out the differences in size, body composition, strength, speed, and endurance, as well as the demands specific to each category's role during the game (Pincivero & Bompa, 1997). These classifications are similar to the training groups that are observed in collegiate football strength and conditioning circles today, often being referred to as "bigs, skills, and big-skills" (Sierer, Battaglini, Mihalik, Shields, & Tomasini, 2008). The report by Pincivero and Bompa was written in 1997. Table 1 demonstrates how the capability, and therefore the game [...]
IJKSS 8(2):47-63
The first number represents the number of running backs, the second number represents the number of tight ends, and the total of the two numbers can be subtracted from 5 to find out how many wide receivers were on the field.
The changes to American football and the evolution of elite athletes have created new positions and thus created similar evolutions in the game demands relative to individual roles. Therefore, what is the best way to group individuals to personalize training based on what the individuals will be required to do in game situations while keeping the trainee-to-trainer ratio low?

Contextualization: Wearable Technology in Football Allows for Objective Observation Data that Quantifies Actions
Using the "bigs, skills, and big-skills" groupings for individualized training for a football organization is valid and is still done today, but due to the uptick in wearable sports tech availability in elite-level athletics (Luczak, Burch, Lewis, Chander, & Ball, 2019), there is now objective data that can provide insight into the actual game demands of elite-level athletes on the field. This data, when combined with careful analysis and professional physiological knowledge, can be used to supplement decision making within the realms of athletic training (Bourdon et al., 2017). Teams can spend millions of dollars on wearable technology in an attempt to mitigate injuries (Hanuska et al., 2017), and the current perception is that wearable data can be used along with physiological expertise to minimize overuse, non-contact injuries (Valovich McLeod et al., 2011). The availability of wearables is somewhat recent (Luczak et al., 2019). In 2013, Hynes et al. observed that there was little access to the technologies that we now know as wearables, but they predicted that there would be an influx in the coming years (Hynes, O'Grady, & O'Hare, 2013). Since then, the demand for wearables has increased, and with it the number of discovered avenues for use (Burch, 2019; Creasey, 2015; Steinbach, 2013; Wright, Smart, & McMahan, 1995). The relative novelty of wearables still leaves the industry in a world of untapped potential regarding innovative functions for the data that wearables provide.
[...] personnel, which implies a spread-out style of offense, has become more commonplace. Conversely, 21 and 22 personnel, with multiple running backs in the backfield, can be indicative of a less spread-out style of offense that relies on strength rather than space and speed (Schofield, 2014). The number of quarterbacks and linemen remains constant, and they are therefore not included in personnel naming.
Using a clustering technique and the objective physiological metrics, along with context and analysis provided by strength and conditioning professionals, will create informed training groups that make the best use of the strength coaches' time and the athletes' training. Additionally, the clusters could shed light on what game demands look like for positions that might not always fit in the same group, such as linebackers. One way to group the data is by using a clustering technique. K-means clustering is a clustering technique that places a chosen number of centers (K) in the data set in such a way that the distance between the centers and the data points assigned to them is minimized (Wagstaff, Cardie, Rogers, & Schroedl, 2001). This provides groups based on any number of variables, where objects in the same group are as similar as possible and objects in different groups are as dissimilar as possible.

Study Purpose: Using K-Means Clustering to Group Competitive Athletes for Training Purposes
The purpose of this study is to lay out a method that can help inform decisions when individualizing training for large groups of athletes that may have varying game or job demands. This will benefit athletes and training practitioners, as the resultant groups will aid in identification of the athletes' needs based on their "in-game" physical demands. Trainers can have more evidence and, therefore, more confidence in the ways in which they individualize their athletes' training, while still accounting for the fact that the number of athletic/strength training professionals is sometimes disproportionately small relative to the number of athletes, which necessitates the use of training groups. Athletes will receive training that is more beneficial to them.

Participants and Design of Study
This descriptive study utilized a partnership with an NCAA Division 1 American football program, which will be referred to as "Team_X" for the purposes of this article. Catapult Sports OPTIMEYE S5 TM (Melbourne, Australia) units were worn by the first- and second-string student-athletes on Team_X. Initial exploration of the data made it clear that inclusion criteria needed to be added to get optimal results. Players that have low distances, loads, or inertial movement analysis (IMA) counts just because they did not participate in as many snaps, or single plays, as their teammates do not need to be considered. Their group would not be relevant to the training classifications, because positions need to be trained for full game demands as opposed to partial ones. Not all players will participate in a substantial number of snaps, but all players must be prepared for the possibility of doing so. Through a paid subscription to PFF.com, one can extract the snap counts of every player for every game. From speaking with collegiate strength and conditioning coaches, a minimum of 25 snaps in a game should be required for a player's game data to count towards the "game demands." Additionally, only games that are against similar "elite" level talent should be considered. Team_X is a "Power 5" school. In order to exclude games where athletes have a higher likelihood of competing against opposition of notably lesser ability than themselves, only games that were played against other "Power 5" schools will be included in the data. "Power 5" refers to the NCAA schools from the traditional power five conferences, Southeastern, Atlantic Coast, Big Ten, Big 12, and Pac-10 (Lindsey, 2006). These schools traditionally make up a majority of the competitively elite teams in college football. Finally, quarterbacks will not be included in the analysis.
Based on the researcher's knowledge and experience with the Catapult data and a discussion with strength and conditioning coaches, the quarterback's unique position would certainly be an outlier and hamper the classification process. Research dealing with athlete classification using tracking tech is novel; therefore, there is no precedent for which variables to include. Leaning on discussions with a high-level strength and conditioning staff and the researcher's experience collecting and analyzing the data, as well as considering the limitations of the variables that are provided by the wearables, the variables being used in the study will be Max Velocity, IMA, PlayerLoad (TPL), distance ran 5 to 8 mph, distance ran 8 to 12 mph, distance ran 12 to 16 mph, distance ran 16 to 25 mph, and total miles ran from the Catapult Sports OPTIMEYE S5 TM, and number of snaps from PFF.

Instrumentation
Catapult Sports OPTIMEYE S5 TM outdoor units were used to capture data. Each athlete wore the device connected to his shoulder pads, placed between the shoulder blades, per Catapult Sports recommendation. PFF was used to supplement data by providing snap counts for each player. RStudio was used to perform data cleaning, assess the K-means clustering technique, and create the visualizations.
For each player who donned the device, the Catapult Sports OPTIMEYE S5 TM tracked numerous variables, including max velocity, IMA, PlayerLoad, various distance intervals ran, and total distance ran for every game and practice. IMA is a count of intense movements that occur throughout each activity. They are measured with a combination of an accelerometer and a gyroscope (Julien, 2020a). Catapult Sports (Melbourne, Australia) explains PlayerLoad as follows, "PlayerLoad is the sum of the accelerations across all axes of the internal tri-axial accelerometer during movement. It takes into account instantaneous rate of change of acceleration and divides it by a scaling factor" (Julien, 2020b). Essentially, PlayerLoad is a measure of total external work done by a player during any activity. These measurements can quantify, objectively, a portion of the physical game demands experienced during any activity. They are not all-encompassing when building a complete physiological profile of an athlete's actions during an American football game. The wearables did not have the ability to capture every physiological metric, but these measurements are the ones that were used by Team_X's strength and conditioning coaches when reporting on player performance.

Data Collection and Preparation
The data was uploaded to the Catapult OpenField Cloud TM after every activity. In the OpenField Cloud account, the reporting feature was used to extract the variables needed in a table format. From the report builder, a table was created for the purpose of exporting to a .csv file. The rows were grouped by athlete and activity, while the parameters chosen were position name, activity name, max velocity, IMA, PlayerLoad, distance ran 5 to 8 mph, distance ran 8 to 12 mph, distance ran 12 to 16 mph, distance ran 16 to 25 mph, and total distance. The table was then exported to .csv format and opened in Excel. Each row that only served to identify the activity was deleted, and an additional parameter was added to every row as a column named "season," which contained either the value 2018 or 2019, in order to identify which season each game belonged to. Following this process, the .csv file was saved. The only data missing was the PFF snap count data. The PFF data was exported to individual files per game. The rows were tagged with player names and activity IDs that matched the previously mentioned .csv file. The files were aggregated into one "Snap Count" file that contained the player's name, the number of snaps that the player participated in, and the activity ID. A relationship was built between the two sheets, linking on player name and activity ID, which in turn provided the ability to add the column that identified "Snaps" for every row of the initial table. This was done at the end of each season, and then the data was appended to the same file. Players' names were replaced with PlayerIDs, which included the position name and a random number, in a column named "PlayerID" for deidentification purposes, using the find and replace feature of Excel. The PlayerID alone could not work as the key identifier because multiple players played in both the 2018 and the 2019 seasons.
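The linking step described above was performed in Excel. As a rough illustration only, the same join, together with a player-season key of the form "PlayerID_Season," could be sketched in base R as follows. The two small tables, their column names, and every value in them are hypothetical stand-ins, not the study's data.

```r
# Hypothetical stand-ins for the Catapult export and the aggregated snap count file.
catapult <- data.frame(
  PlayerID   = c("WR_1001", "WR_1001", "OL_2002"),
  ActivityID = c("G1", "G2", "G1"),
  Season     = c(2018, 2019, 2018),
  PlayerLoad = c(410.5, 388.2, 295.0),
  stringsAsFactors = FALSE
)
snap_counts <- data.frame(
  PlayerID   = c("WR_1001", "WR_1001", "OL_2002"),
  ActivityID = c("G1", "G2", "G1"),
  Snaps      = c(52, 47, 61),
  stringsAsFactors = FALSE
)

# Link the two tables on player and activity, mirroring the spreadsheet relationship.
df <- merge(catapult, snap_counts, by = c("PlayerID", "ActivityID"))

# A player who appears in both seasons needs one key per season,
# so the player ID and the season are pasted together.
df$PlayerID_Season <- paste(df$PlayerID, df$Season, sep = "_")

print(df)
```

In this toy table the same receiver appears in both seasons, and the pasted key keeps his 2018 and 2019 rows distinct.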
Therefore, when the data was imported to the RStudio workspace, the key identifier was created by combining the PlayerID and Season columns. The researchers desired one value per season per category. The key was "PlayerID_Season," where PlayerID is the player's identification code and Season is either 2018 or 2019. Each data point was an average of each individual category through every game in which the player met the inclusion criteria in each respective season. For example, each row had an average PlayerLoad, Max Velocity, IMA, distance ran 5 to 8 mph, distance ran 8 to 12 mph, distance ran 12 to 16 mph, distance ran 16 to 25 mph, total miles, and snap count. Therefore, if there were 30 players with data in each season, there would be a total of 60 rows with 9 columns, each cell being equivalent to a season's average of games where the player met the inclusion criteria. The inclusion criteria and data preparation discussed were executed in R using the filter(), group_by(), and mean() functions in the code that follows:

    library(dplyr)

    # Drop rows with missing values, then apply the inclusion criteria
    df <- na.omit(df)
    df <- filter(df, Snaps > 25, PositionName != "QB")

    # Average every metric per player per season
    df <- select(df, PlayerID_Season, MaxVel, IMA, PlayerLoad, `5to8mph`, `8to12mph`,
                 `12to16mph`, `16to25mph`, Tot_Distance, Snaps) %>%
      group_by(PlayerID_Season) %>%
      summarise(MaxVel = mean(MaxVel), IMA = mean(IMA), TPL = mean(PlayerLoad),
                `5to8mph` = mean(`5to8mph`), `8to12mph` = mean(`8to12mph`),
                `12to16mph` = mean(`12to16mph`), `16to25mph` = mean(`16to25mph`),
                Tot_Distance = mean(Tot_Distance), Snaps = mean(Snaps))

    # Move the key into the row names so that only numeric columns remain
    df <- data.frame(df, row.names = "PlayerID_Season")

Following the data cleaning, the distance between each subject could be visualized. First, the data was scaled using the scale() function. The distance between each subject was then acquired using the get_dist() function, and the fviz_dist() function was used to output a distance matrix that uses a color scale to compare how close or far away players are from one another.
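The scaling and distance steps can be sketched on a toy table as below. This is a minimal illustration using base R's scale() and dist() rather than the study's own script; the four players and their values are invented, and the commented line shows how the factoextra functions named in the text would typically be called, assuming that package is installed.

```r
# Hypothetical season averages for four players (values invented for illustration).
game_avgs <- data.frame(
  MaxVel = c(20.1, 19.4, 15.2, 14.8),
  TPL    = c(420, 395, 300, 310),
  Snaps  = c(55, 48, 62, 60),
  row.names = c("WR_a", "DB_b", "OL_c", "OL_d")
)

# Standardize each column to mean 0 and standard deviation 1.
scaled <- scale(game_avgs)

# Pairwise Euclidean distances between players, as a full symmetric matrix.
dist_matrix <- as.matrix(dist(scaled, method = "euclidean"))
print(round(dist_matrix, 2))

# With factoextra installed, the equivalent colored distance plot would be:
# factoextra::fviz_dist(factoextra::get_dist(scaled, method = "euclidean"))
```

In this toy table the two linemen sit close to each other and far from the receiver, which is the kind of pattern the colored distance matrix is meant to expose.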

Data Analysis via K-means Clustering
Other clustering methods exist, such as density-based clustering and hierarchical clustering, but K-means clustering was ultimately chosen for this data set due to the need to evaluate every single "point" in the data set, as opposed to only considering points or clusters nearby (Open Data Science, 2018). Unlike the K-means approach, density-based clustering does not consider all points in the dataset when creating its clusters (Open Data Science, 2018). Density-based clustering considers data points that are in close proximity to each other while treating every other point as noise (Kriegel, Kröger, Sander, & Zimek, 2011). For this project, every individual athlete was to be placed in a group, accounting for the similarities to every other subject in the dataset. Therefore, K-means was judged to be more useful than density-based clustering in this project. Hierarchical clustering creates clusters by making smaller clusters and then associating those smaller clusters with others in order to reach the desired number of clusters (Olson, 1995). However, similar to the downfall of density-based clustering, no information about other points is considered (Open Data Science, 2018). K-means clustering requires the distance between every point in the dataset in order to compare how closely the points relate to one another. According to Shirkhorshidi et al., there are three distance measuring techniques that apply specifically to K-means clustering: Euclidean, Average Distance, and Manhattan (Shirkhorshidi, Aghabozorgi, & Ying Wah, 2015). This study will use Euclidean distance measurements along with scaling of all variables so as not to allow the "largest scaled variable" to dominate the others. Euclidean and Manhattan distances are the most commonly used (Shirkhorshidi et al., 2015), and the Average Distance technique would minimize the effect of outliers, which was not a goal of this project. Additionally, Singh et al., who compared the use of Euclidean and Manhattan distances when performing the K-means technique, concluded, "the K-means, which is implemented using Euclidean distance metric gives the best result […]" (Singh, 2013). The Euclidean distance was used for this study for its ease of calculation and seemingly more common use in K-means clustering (Shirkhorshidi et al., 2015).
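To make the role of scaling concrete, the short sketch below compares Euclidean and Manhattan distances on three invented players, before and after scaling. It is only an illustration under assumed values and is not part of the study's analysis.

```r
# Three hypothetical players: one small-scale variable (mph) and one large-scale variable.
players <- rbind(
  A = c(MaxVel = 20, Tot_Distance = 4000),
  B = c(MaxVel = 15, Tot_Distance = 4100),
  C = c(MaxVel = 20, Tot_Distance = 4150)
)

# Unscaled: Tot_Distance dominates, so A looks closer to B than to C
# even though A and C have identical maximum velocity.
raw_euc <- as.matrix(dist(players, method = "euclidean"))

# Scaled: both variables contribute on a comparable footing.
scaled     <- scale(players)
scaled_euc <- as.matrix(dist(scaled, method = "euclidean"))
scaled_man <- as.matrix(dist(scaled, method = "manhattan"))

print(round(raw_euc, 1))
print(round(scaled_euc, 2))
print(round(scaled_man, 2))
```

Before scaling, player A appears nearest to player B; after scaling, A is nearest to C. This reversal is the "largest scaled variable" effect that the scaling step is intended to prevent.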

Totss: The total sum of squares.
Withinss: Vector of within-cluster sum of squares, one component per cluster.
Size: The number of points in each cluster.
Iter: The number of (outer) iterations.

[...] centers to using 7 centers, and the ones that closely resemble the traditional groupings were examined. The last step of the data analysis process, and one that will not be included in this study because of the need for deidentification of the data, was to review each of the groupings with the training professional in order to come to a final conclusion on the training group configurations.
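As a hedged sketch of how such a comparison across numbers of centers might be run, the example below fits base R's kmeans() for 2 through 7 centers on simulated data and inspects the output components listed above. The simulated data, the seed, and the choice of nstart = 25 are assumptions made for illustration, not the study's settings.

```r
set.seed(42)

# Simulated stand-in for the scaled player table: 30 "players", 3 variables,
# drawn around three different means so that some group structure exists.
toy <- rbind(
  matrix(rnorm(30, mean = 0),  ncol = 3),
  matrix(rnorm(30, mean = 5),  ncol = 3),
  matrix(rnorm(30, mean = 10), ncol = 3)
)
scaled <- scale(toy)

# Fit K-means for each candidate number of centers, 2 through 7.
# nstart = 25 repeats each fit from 25 random starts and keeps the best one.
fits <- lapply(2:7, function(k) kmeans(scaled, centers = k, nstart = 25))
names(fits) <- paste0("k", 2:7)

# Inspect the components described above for the 3-center solution.
fit3 <- fits[["k3"]]
print(fit3$totss)     # total sum of squares
print(fit3$withinss)  # within-cluster sum of squares, one value per cluster
print(fit3$size)      # number of points in each cluster
print(fit3$iter)      # number of (outer) iterations

# Total within-cluster sum of squares for each number of centers.
print(sapply(fits, function(f) f$tot.withinss))
```

Comparing the total within-cluster sum of squares across the fits is one way to see how much each additional center tightens the groups, before the groupings are reviewed against the traditional ones.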

RESULTS
The results will be presented in three subsections: one for each individual season in which the data was collected and then one with both seasons' data combined. This will provide the most complete measurement of the use cases and validity of the tool. The assumption was made that the system or offensive playstyle did not change at the particular Division 1 program (head coach was kept in place) that the data was recorded from, but because of the nature of collegiate football (only being allowed 4 years of eligibility) there is a high turnover rate amongst student athlete personnel. This is important because as an offensive system changes or personnel within the system change, the game demands change as well. This makes the two years comparable, but there is also validity to analyzing them separately in order to account for differences in roles based on system or individuals.

Results from Season 1
Season 1 contained 30 student athletes with game data meeting the inclusion criteria. Figure 2 represents the Euclidean distances between every subject. The diagonal line of deeper blue boxes represents each participant being compared to himself (a difference of zero). The red areas show that the most different groups are the DB and WR individuals as compared to the OL individuals; this makes sense, as they have clearly different roles during a game. OLs are typically the largest individuals on the field and have requirements that are very strength-based, including a lot of close quarter combat without a lot of movement. Conversely, WRs have demands that are based on agility, speed, and running long distances. Figure 3 represents an example of how the groupings would look when using 3 centers, which is the traditional way of grouping athletes for training. Figure 4 displays the different groupings and how inclusive they are depending on the number of centers. The scatter points are in the same position on the graphs in Figure 4 as they are in Figure 3. Therefore, for reference, compare the two figures to determine which groups contain which positions. As stated in the methods section, the visuals were created by PCA. The dimensions used in the PCA are titled on the X and Y axes of each plot. Dim1 accounts for as much of the variability in the dataset as possible (in season 1's case, Dim1 accounts for 61.6% of the variance of the dataset). Dim2 accounts for the highest variability possible while being orthogonal to Dim1. Figure 4 represents a cluster consisting of 4 DBs, 4 WRs, and 1 LB; a second cluster consisting of 4 DLs, 3 DBs, 2 RBs, 2 TEs, 1 WR, and 1 LB; and a third cluster consisting of 7 OLs, 1 DL, and 1 LB, as seen in Table 3. Figure 5 represents how the parameters affect the two dimensions shown on the plot. Figure 5 shows that Dim1 is represented by variables that are influenced by distance and speed (max velocity and distance in speed zones). Maximum velocity and distance ran between 5 and 8 miles per hour are hidden under distance ran between 12 and 16 miles per hour, with the vectors going in the negative x direction, whereas Dim2 is represented by IMA and the number of snaps taken during a game.
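The relationship between Dim1 and Dim2 can be illustrated with a small PCA sketch in base R. The simulated variables below are assumptions chosen so that two of them share a common "speed" signal; prcomp() labels the components PC1 and PC2, which play the role of Dim1 and Dim2 in the plots.

```r
set.seed(7)
n <- 30

# Simulated data: two variables driven by a shared speed signal, two independent ones.
speed <- rnorm(n)
toy <- data.frame(
  MaxVel     = speed + rnorm(n, sd = 0.3),
  Dist12to16 = speed + rnorm(n, sd = 0.3),
  IMA        = rnorm(n),
  Snaps      = rnorm(n)
)

# PCA on centered and scaled variables.
pca <- prcomp(toy, center = TRUE, scale. = TRUE)

# Share of total variance carried by each component, largest first.
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
print(round(100 * var_explained, 1))

# Loadings show which variables drive each of the first two dimensions.
loadings <- pca$rotation[, 1:2]
print(round(loadings, 2))

# The first two loading vectors are orthogonal, so their dot product is ~0.
print(sum(loadings[, "PC1"] * loadings[, "PC2"]))
```

The loading table is the numeric counterpart of the vector plot in Figure 5: variables with large loadings on the first component are the ones that spread players along the x axis.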

Results from Season 2
Season 2 contained 36 student athletes with game data meeting the inclusion criteria. The similarity matrix represented in Figure 6 again shows linemen being similar and DBs and WRs being similar, but linemen being the most different from DBs and WRs. When using the traditional grouping method of 3 centers for the dataset from season 2, the groups came out slightly different in form than in season 1. Figure 7 represents a group consisting of 3 DBs, 1 WR, and 1 LB (rightmost group in Figure 7); a second group consisting of 5 DBs, 5 WRs, 2 TEs, and 2 LBs (center); and a third group consisting of 8 OLs, 6 DLs, 2 RBs, and 1 LB (leftmost), as seen in Table 4.
These clusters differ from those of the first season, but mostly because of the uniqueness of the 3 DBs in the most positive position (upper right corner) of Figure 7. Using 4 centers with season 2 provides groups more similar to those of season 1 while also providing another group for the unique student athletes. These results can be seen in Figure 8 and Table 5. Table 5 displays 4 groups, including a group that encompasses 3 DBs and a LB; a group that contains 5 WRs and 5 DBs; a group that contains 1 WR, both RBs, both TEs, all DLs, and 3 LBs; and a group consisting exclusively of all the offensive linemen. Figure 9 represents different numbers of centers (2 through 7). The scatter points are the same as the labeled points in Figures 7 and 8; therefore, comparing Figure 7 or 8 to Figure 9 will give a good idea of how the groups change as the number of centers chosen increases. Figure 10 provides a graphical representation of the PCA used to visualize these groupings. Similarly to the season 1 analysis, most of the variance in the dataset can be explained by the x axis, which is influenced heavily by variables that are related to speed and distance (distance in speed zones and maximum velocity), whereas Dim2 is more heavily influenced by snaps participated in and IMAs. The difference is that the positive and negative directions have flipped.

Seasons 1 and 2 Results
Figure 11 displays a similarity matrix portraying findings consistent with those of Figures 2 and 6 in identifying that the wide receivers' and defensive backs' game demands are most dissimilar to those of offensive linemen. The visualizations become more crowded when considering all 66 student athletes that met the inclusion criteria. Figures 12 and 13 display the clusters, while Tables 6 and 7 compare the inclusion of position groups between the two cluster amounts.
Using only 3 centers creates a group that contains every OL, all but 1 DL, half of the RBs, and over half of the LBs. This is clearly the "bigs" group, but deciphering between traditional "skills" and "big-skills" groups is a little bit harder. One can assume that because of the DL, TEs, and RBs in group number 3, this could be the "big-skills" group.
Using 4 clusters creates groups with clearer differentiating lines. Table 7 portrays that using 4 clusters creates a cluster, referred to as cluster 1 in Table 7, that includes RBs, TEs, 4 DLs, a LB, 1 WR, and 3 DBs. This seems like a more appropriate "big-skills" group. Meanwhile, clusters 3 and 4 seem to form a split "skills" group, similar to the group created in season 2 when 4 clusters were used. The overlapping of clusters 2 and 3 in Figure 12 is a demonstration of the difficulty of portraying complex groupings that consider numerous variables, as this study does. There are more dimensions present than just the two displayed on the scatter plots. This is why clusters 2 and 3 overlap in Figure 12. When considering all the dimensions, DB_5395, DB_3764, and LB_8472 are included in cluster 3 because they minimize within-cluster variation across all of the variables, even though this may seem counterintuitive based on Figure 12's portrayal of the results.

The PCA for the combined seasons 1 and 2 dataset looks different from those of the two individual seasons. The dimensions used to graph the data in Figures 12 and 13 do not explain as much of the variance as those in the figures representing each individual season. PC1 still accounts for over half of the variance and includes mostly distance and speed related variables, as shown by Figure 14.

Season 1
While using the k-means clustering method is nothing new for sports, a detailed literature search found nothing specific to its application in American football for the purpose of identifying training groupings. Soccer, on the other hand, has had a number of recent studies utilizing k-means clustering for everything from pregame expectations of the athlete (Popovych et al., 2020), to special movement patterns during a game (Beernaerts, de Baets, Lenoir, & van de Weghe, 2020), to game performance as it relates to new contracts (Gómez, Lago, Gómez, & Furley, 2019), to self-determination (Sarmento, Peralta, Harper, Vaz, & Marques, 2018) and emotional intelligence (Louvet & Campo, 2020), to the risk of eating disorders (Izquierdo, Ceballos, Ramírez Molina, Vallejo, & Díaz, 2019). One soccer study did find that k-means clustering was not a good technique for assessing the four velocity zones during a match because of the subtle differences between velocity thresholds based on the Catapult Sports MinimaxX S4 TM used (Park, Scott, & Lovell, 2019). Still, no study was found specifically for the purpose of classifying athlete game loads into groupings that the strength staff could use for more specific training. Other studies looking at sports, such as basketball and the National Basketball Association (NBA), used k-means clustering to attempt to predict the outcome of games (Cheng, Zhang, Kyebambe, & Kimbugwe, 2016). While the authors of this study believe that the Catapult data did show some predictive capability, based on the researchers' extreme familiarity with the student-athletes and the coaching staff, this was not the intent of this particular study.
For this study, the clustering results from season 1 show three groups that are very similar to the traditional groupings used today and those presented in the existing literature (Pincivero & Bompa, 1997; Sierer et al., 2008). However, there is one difference in that there is 1 LB in every group. Without identifying the data, this study cannot analyze the playstyle or role within the defensive system of each of these linebackers, which is not vital in introducing the technique as valid but would complete the final analysis in the real-world application. In a real-world application, the coaching practitioners using the identified data could analyze the student-athletes and then make decisions on how to train accordingly. If the linebacker's role in the defense is the reason for the different groupings, then the linebacker needs to train in a group that more closely resembles the training that his game demands require. For example, a linebacker with a higher capacity for speed and agility may get deployed in a role within the defensive system that requires high volumes of distance and unimpeded accelerations, whereas another linebacker with more size and a higher capacity for power is more likely to have a role that does not require running large distances but includes more close quarter combat with opponents (Reid et al., 2020). Although these two players may share a classification and may even share similar body types, their game requirements would benefit from different training modalities.
Similarly, in the season 1 data, we see 1 WR and 3 DBs in the group that resembles the traditional "big-skills" group. Again, there is no way of analyzing this specific data because of the deidentification, but in application, professionals can analyze why these players fall in this group. The reason could be that the student-athletes are not required to run as fast or as far as the individuals in the "skills" group (Reid et al., 2020). Alternatively, the classifications could mean that the WR or DB is not capable of running at the speed of the other individuals. This could lead to an intervention program to develop the player to meet the standard that is required for the positional role demands of the team's system.
When examining k=4 from Figure 4, one can see that the groups remain the same except for what was known as cluster 2 from Figure 3. This k=4 cluster had positions that would be considered "skills" group individuals and is broken into two separate clusters. The cluster in the upper left of Figure 4 where k = 4 contains all but one of the WRs, a DB, and a TE. The other cluster contains 3 DBs, 2 LBs, and a WR. Additionally, the "big-skills" group gets smaller through the reclassification of the WR and TE that were a part of cluster 3 (middle cluster) when only 3 centers were used. This suggests that, had the strength and conditioning professionals been interested in using 4 groups to increase individualization in training, the group containing the most within-cluster variation is the cluster being compared to the traditional "skills" training group. The resultant splitting of the "skills" group naturally separated WRs and DBs and added 2 of 3 LBs to the cluster containing DBs, while one TE was added with the WRs.
Description: This table displays the representation from each position group in the three clusters as represented graphically in Figure 12.

The two resultant "skills" clusters were interesting, as they suggest that the technique differentiated "covering," a job performed by DBs and sometimes LBs, and "route-running," a job performed by WRs and sometimes TEs, before it differentiated the demands of RBs and DLs in the "big-skills" group of k=3 displayed in Figure 3. This differentiation does not emerge until the dataset is clustered using 6 centers. This leads to the assumption that whatever role the RB was playing must have involved similar demands to some of the DLs. This presumably would not always be the case, but with added context from identification of the data, one could make a statement on the playstyle or game demands of the RBs and/or DLs on this team. This, in turn, could aid trainers when planning training regimens for these positions. The possibility of these two position groups being clustered together is not improbable, but it does warrant a closer look. The playstyle of the individual at these positions undoubtedly has a large effect on how they are clustered. This brings clarity to the idea of how different playstyles can affect the demands of positions (Fullagar, McCunn, & Murray, 2017), which is evidence that this tool could aid in classification of individuals for training groups, especially at positions that are so diverse, such as DLs, RBs, TEs, and LBs.
Description: These graphs display groupings based on the numbers of clusters that were assessed from seasons 1 and 2.

Season 2
When clustering into 3 groups for the season 2 data, the groups do not fit the traditional mold (Pincivero & Bompa, 1997) as well as those from the season 1 dataset did. The "skills" seem to get differentiated into 2 groups, and the "bigs" and the "big-skills" are lumped into 1 final group. When observing the clusters using k=3 in Figure 7, there is a very clear differentiation of the 4 individuals in the uppermost right-hand corner of the graph from the rest of the individuals. We cannot identify the data, but in this case, Figure 10 shows that the individuals in the upper right corner of the graph were heavily differentiated by TPL, which represents each individual's PlayerLoad, a measure of the total external work done by a player during any activity. One could infer that these individuals were differentiated because of the workload demanded of them in a game as compared to others (Ward, Ramsden, Coutts, Hulton, & Drust, 2018). However, when using four clusters, the players with largely positive PlayerLoads are contained in their own group. Consequently, the leftover groups become more traditional (Pincivero & Bompa, 1997). Table 5 displays a "bigs" group (cluster 3), a "big-skills" group (cluster 2), and two "skills" groups (clusters 4 & 1). Cluster 4 contains the "skills" with the highest workloads. This could represent a need to train the two differentiated "skills" groups uniquely (Wellman, Coad, Flynn, Climstein, & McLellan, 2017). Once this type of information is discovered, it is up to the organization's professionals to make decisions. This may spark the question, "Are these specific individuals being asked to do too much during games?" Assuming these players need to bear an enhanced workload in order to put the team in the best position to succeed, they may need to train in a unique group in order to put them in the best position to succeed from a conditioning perspective.
This points back to the importance of professionals being involved to add context. This is not a tool to answer all questions, but rather a technique to better inform decision-making of professionals.
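One way to check which metric separates a cluster from the rest is to standardize each metric and average the standardized values within the cluster. The sketch below assumes z-score standardization, which this section does not confirm was the study's preprocessing. All values, the `distance_m` metric, and the cluster names are hypothetical; only TPL is a metric named in the text.

```python
from statistics import mean, pstdev

def zscores(values):
    """Standardize one metric: subtract the mean, divide by the population SD."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

# Hypothetical per-athlete game averages (one list entry per athlete).
metrics = {
    "TPL": [300, 310, 320, 330, 600, 620],
    "distance_m": [3000, 3500, 3200, 3400, 3600, 3700],
}
# Hypothetical cluster membership for the same six athletes.
clusters = ["rest", "rest", "rest", "rest", "high_load", "high_load"]

standardized = {name: zscores(vals) for name, vals in metrics.items()}

def cluster_profile(cluster):
    """Mean z-score of every metric for the athletes in one cluster."""
    idx = [i for i, c in enumerate(clusters) if c == cluster]
    return {name: mean(z[i] for i in idx) for name, z in standardized.items()}

profile = cluster_profile("high_load")
# The metric with the largest absolute mean z-score differentiates the cluster most.
driver = max(profile, key=lambda name: abs(profile[name]))
```

A large positive mean z-score on TPL for one cluster is the numerical counterpart of players with "largely positive PlayerLoads" sitting apart on the plot. Whether that workload is appropriate remains a question for the practitioners.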
Using 4 centers produces the groups that most closely resemble the traditional groupings (Pincivero & Bompa, 1997), and Table 5 shows that positionally clustered anomalies are fewer than in season 1. One LB is not clustered with the other three in the "big-skills" group. It is important to note that this 1 stand-alone LB was clustered with the "skills" in cluster 4, the cluster that contained all the high-workload individuals, which brings up the same workload questions that were discussed earlier. One WR is not grouped in one of the "skills" groups. Additionally, 1 DL was not clustered with the other 7 DLs in the "big-skills" group. With the added context of identification and the American football expertise of organizational professionals, this information could aid decision-making processes. The WR could be a large individual who is used in similar ways to a TE. If this individual does not already train with the "big-skills" group, and usually trains with the other receivers in the "skills" group, this evaluation may inspire a change in training group. Even simple clarifications, such as where on the defensive line each of the DL individuals plays, would aid in answering the question of why 7 out of 8 of these DLs were grouped with the "big-skills" cluster in Figure 8 (k=4). When considering the season 1 data, the researchers simply assumed that the 4 tracking units were on defensive ends, who, as discussed in the introduction, are faster and have different demands now than previously, while the 1 DL that was grouped with the "bigs" was a defensive tackle who was not required to move as much or as fast as the other 4 DLs (Reid et al., 2020). Now that there are 8 total DLs and still only 1 of them groups with the "bigs," this assumption seems a little less safe.
If some of those DLs are defensive tackles, the following questions could be asked: "Do these individuals need to be trained differently because their game demands seem closer to those of a big-skill than a big?" "Are their game demands more representative of their extended abilities or of the role that they are being asked to play within this organization's defense?" "If those individuals have the capabilities of a defensive end, can we expand their role within our defensive system to make the team better?" and "Does training these individuals in a different group with an individualized training program aid them in becoming the players that the organization now believes they can become?" Again, this emphasizes that this tool needs context and added professional opinion to answer these questions, but it can be useful in forming questions that may not have been present originally, thus aiding in the quest for optimal usage of the organization's resources.
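Spotting athletes who cluster away from most of their position group can be automated with a simple tally. The sketch below is illustrative only. The athlete IDs and cluster assignments are hypothetical, arranged to mirror the pattern described above (7 of 8 DLs and 3 of 4 LBs in cluster 2), and the rule of comparing each athlete to the position's most common cluster is an assumption, not the study's procedure.

```python
from collections import Counter, defaultdict

def flag_outliers(assignments):
    """Flag athletes whose cluster differs from their position's most common cluster.

    assignments: {athlete_id: cluster}; the position is the id prefix before "_".
    Returns {athlete_id: (own_cluster, position_majority_cluster)}.
    """
    by_position = defaultdict(Counter)
    for athlete, cluster in assignments.items():
        by_position[athlete.split("_")[0]][cluster] += 1
    flagged = {}
    for athlete, cluster in assignments.items():
        # On ties, most_common keeps the cluster that was seen first.
        majority, _ = by_position[athlete.split("_")[0]].most_common(1)[0]
        if cluster != majority:
            flagged[athlete] = (cluster, majority)
    return flagged

# Hypothetical k=4 assignments: cluster 2 = "big-skills", 3 = "bigs", 4 = high-workload "skills".
assignments = {
    "DL_01": 2, "DL_02": 2, "DL_03": 2, "DL_04": 2,
    "DL_05": 2, "DL_06": 2, "DL_07": 2, "DL_08": 3,
    "LB_01": 2, "LB_02": 2, "LB_03": 2, "LB_04": 4,
    "OL_01": 3, "OL_02": 3,
}
flagged = flag_outliers(assignments)
```

The flagged list is only a starting point for the questions posed above. It says that an athlete's game demands differ from those of his position peers, not why.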
Similar to the season 1 dataset, season 2 provided results that mostly aligned with the traditional groupings (Pincivero & Bompa, 1997). This time, 4 clusters were needed to make clear connections to the traditional groups. These similarities between the technique's clustering and the groupings offered by Pincivero and Bompa (1997) are encouraging because the traditional groupings are already being used by organizations and cited in the literature. Again, as in the season 1 analysis, the k-means clusters were not so similar to the traditional groupings that they offered no interesting information. On the contrary, the technique brought differences to light and helped form important questions for the coauthors who serve as coaching practitioners. Thus, this method offers value as an additional decision-making tool when forming training groups.

Seasons 1 and 2
Using 2 seasons' worth of data together did not bring forth clear representative clusters that resembled the traditional "bigs, big-skills, and skills" groupings (Pincivero & Bompa, 1997) as the data from the individual seasons did. A clear differentiation still exists between the group of WRs and DBs and the group of OLs and DLs, as noted in previous studies (Fullagar et al., 2017), but the other positions do not seem to split up as uniformly. Using 4 clusters eases this dilemma by creating groups that can be loosely identified in Table 7 as "bigs" (cluster 2), "big-skills" (cluster 1), and "skills" (clusters 3 and 4). However, this identification of groups leaves room for doubt, as WR_6348, RB_9949, RB_5208, TE_8623, and TE_8449 would be put in the "bigs" group. This does not fit the general mold set forth by the stand-alone analysis of season 1, season 2, or the pre-existing traditional grouping (Pincivero & Bompa, 1997).
Interestingly, every one of the individuals mentioned above as unusually classified (WR_6348, RB_9949, RB_5208, TE_8623, and TE_8449) comes from the season 2 dataset. Further, all WRs in cluster 4 are from the season 1 dataset, while all WRs in cluster 1 are from the season 2 dataset. These are all offensive positions, and the pattern coincides directly with coaching staff changes that occurred between seasons. This gives reason to believe that the seemingly misidentified clusters could be representative of a change in offensive system from one year to the next, meaning the positions remained the same but the game demands of the position or the training of those individuals were altered. Alternatively, the apparent change in data from one season to the next, particularly for the offensive positions, could be representative of new players. The changes are most likely an extension of both reasons, new players and altered roles within the offensive system, but there is no way to know for sure without further added context, which would require player identification. Regardless, this seems to suggest that the technique works best for within-season grouping and that new playstyles will significantly alter game demands.
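A quick tally of season by cluster makes this kind of season effect visible for any position. The sketch below is illustrative only. The records are hypothetical (they are not the study's athletes), and the `(athlete_id, season, cluster)` layout is an assumed data format.

```python
from collections import Counter, defaultdict

def season_mix(records, position):
    """Return {cluster: Counter of seasons} for the athletes at one position."""
    mix = defaultdict(Counter)
    for athlete_id, season, cluster in records:
        if athlete_id.split("_")[0] == position:
            mix[cluster][season] += 1
    return dict(mix)

# Hypothetical combined-seasons output: (athlete_id, season, cluster).
records = [
    ("WR_01", 1, 4), ("WR_02", 1, 4), ("WR_03", 1, 4),
    ("WR_11", 2, 1), ("WR_12", 2, 1),
    ("DB_01", 1, 3), ("DB_11", 2, 3),
]

wr_mix = season_mix(records, "WR")
db_mix = season_mix(records, "DB")

# True when every cluster holding this position draws from a single season only.
wr_split_by_season = all(len(seasons) == 1 for seasons in wr_mix.values())
db_split_by_season = all(len(seasons) == 1 for seasons in db_mix.values())
```

When a position splits cleanly by season, as the hypothetical WRs do here, the clusters may be reflecting a change in system or personnel rather than a stable difference in role.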
The effect of which season the data come from seems clear, but the differences between an RB and an OL seem like they should be enough for the k-means technique to differentiate, even between seasons. However, RB_9949 (from season 2) is grouped with OLs all the way through k=7, as seen in Figure 13. Additionally, RB_5208 at k=7 is grouped somewhat more appropriately with DLs, a TE, and two LBs, but the group is still composed largely of OLs. In the single-season analyses of seasons 1 and 2, using 3 or more clusters, OLs were consistently contained within a group that differentiated itself from positions like RB. In contrast, when combining the seasons, the group that would easily be classified as "bigs" contained positions that traditionally fit into other training groups. This leaves room for reasonable doubt about the technique, specifically when using data across multiple seasons. Further detail providing the context of identification could address some of these concerns. A TE could be used mostly for blocking, and his game demands could therefore be like those of an OL. The same could be said for an RB if the player was mostly used for blocking, such as a traditional fullback, but the fact remains that these same players were grouped more sensibly when looking only at the athletes that participated in season 2. Within the context of sports, the technique worked well to assess season by season and between seasons, but due to changes in players, there was less value in looking at combined seasons.

Application for Strength and Conditioning Coach Practitioners
In the example of Team_X, the engineers working with the coaching staff were the ones to adopt the k-means clustering technique and calculate its outcomes. To apply the technique, however, strength and conditioning coaches working alongside the engineers can evaluate the reported clusters and, using their expertise, consider how many athlete training groups (k-means clusters) best capture athletes with similar demands, as well as how many groups are feasible for the training team to handle. Additionally, players who might seem to be "between groups" (i.e., an athlete who could train with the "bigs" or the "big-skills") will now have a mathematically suggested group based purely upon the game demands collected from wearable technology.
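When weighing how many groups to use, one number coaches and engineers can look at together is the within-cluster sum of squares, which shows how tightly each candidate group holds together. The sketch below is a minimal illustration under stated assumptions: the points and candidate labelings are hypothetical, and `max_groups` is a made-up staffing limit, not a value from the study.

```python
def within_cluster_ss(points, labels):
    """Return {cluster: sum of squared distances from members to their centroid}."""
    members = {}
    for point, label in zip(points, labels):
        members.setdefault(label, []).append(point)
    result = {}
    for label, pts in members.items():
        centroid = [sum(axis) / len(pts) for axis in zip(*pts)]
        result[label] = sum(
            (value - center) ** 2
            for p in pts
            for value, center in zip(p, centroid)
        )
    return result

# Hypothetical standardized game demands for 12 athletes.
points = [
    (-2.0, -2.0), (-2.2, -1.8), (-1.8, -2.1),
    (0.0, 0.0), (0.2, -0.1), (-0.1, 0.2),
    (2.0, 3.0), (2.2, 3.1), (1.9, 2.8),
    (3.0, 1.8), (3.2, 2.0), (2.9, 1.7),
]
# Hypothetical cluster labels produced for each candidate k.
candidates = {
    3: [0, 0, 0, 1, 1, 1, 2, 2, 2, 2, 2, 2],
    4: [0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3],
    5: [0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 4],
}
max_groups = 4  # hypothetical: the most groups the training staff can run

report = {}
for k, labels in candidates.items():
    if k > max_groups:
        continue  # not feasible for the staff, so not reported
    ss = within_cluster_ss(points, labels)
    report[k] = {"total": sum(ss.values()), "loosest": max(ss, key=ss.get)}
```

Total within-cluster variation generally falls as k grows, so these numbers inform the choice rather than make it. The "loosest" entry indicates which group would benefit most from being split if the staff can handle one more group.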
Combining Team_X's Catapult tracking data from individuals' games with the k-means clustering technique could provide valuable insights on training groupings that are more relevant to the current state of both collegiate football and the NFL. While the technique is the most important part of this study (for the engineers), the protocol of this study could be used by any American football team to take into consideration the unique playing style of individual offensive and defensive schemes. Playstyle in American football dictates athlete demands; therefore, demands will be unique for each team. The results of this type of data and this analysis technique will create insights specific to the individuals whose data were collected. Further, context matters! The measurements being used are not all-encompassing when building a complete physiological profile of an athlete's actions during an American football game. The nature of American football introduces so many variables that accounting for all of them seems nearly impossible, even for the coaching practitioners who are members of this research team. The Gatorade Sports Science Institute published an in-depth article outlining the demands of American football. That report shares some of the same contextual variables detailed within this study, including team play style, playing surface, temperature, positional differences, physical capabilities, quality of the opponent, and technical qualities, among others (Bangsbo, 2014). A complete set of data for every game and practice that would cover all these contextual factors would prove too time-consuming and costly for any team. Therefore, expertise from experienced American football strength and conditioning personnel is needed to supplement the data that the physiological metrics can provide.
While wearable data can provide absolute measurements for specific variables, careful thought is needed when deciding what understandings the data actually provides. The insights gleaned from this technique will not be strong enough to base decisions on alone, but with insights brought about by coaches, they can be used to influence decisions around programming for strength and conditioning teams.

Limitations and Future Research
The noted lack of identification was a major limitation of this study. Being able to associate more specific roles, playstyles, and physical attributes to individuals would have unlocked a lot more analyses on the clusters that were created. As is, the research was able to suggest possible reasons for certain clustering results, but nothing could be said for certain. Additionally, the technique did not perform well when used on the combined seasons dataset as compared to the single season results. Possible reasons offered for this were a difference in playstyle or personnel of the team. This could be further examined if context were provided. Additionally, this could be examined by using the technique in a setting where these variables were controlled from one season to the next. The personnel would have to be the same and the playstyle of the players and the team would have to remain the same.
The variables used in this study could also be considered a limitation. Most of the variables involved distances, speed, and general workload. The argument could be made that these variables did not provide the entire picture of what an athlete's demands are during a game. The variables in this study were used because they were what Team_X provided. Additionally, every organization is limited in the parameters it can use by the wearables it has. This does not mean that one cannot get a good estimation of game demands, but these limitations require added context when observing and making actual decisions based on the data collected by the wearables. Future research projects can be used to overcome the limitations discussed earlier. Because the data's value comes, in part, from the fact that it is real game data, it is difficult to remove context from the data. Therefore, the analysis and discussion must account for it. Still, a study could be done within the context of a team that has minimal personnel turnover as well as a consistent playstyle. More research could also be done to analyze the effect that different tracking variables have on the resultant groups. Because this project was used to introduce a technique, and comparison to accepted training groupings was used as the validation technique, a greater number of instances in which the resulting clusters are compared to the traditional groupings will strengthen the validation of the technique as a commonplace tool in athletics. Additionally, interviewing strength & conditioning coaches during and after implementation of the technique can evaluate the usability as well as the validity of the technique in real-world application.
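Comparing two groupings of the same athletes, whether they come from different variable sets or from clustering versus traditional training groups, can be made quantitative with an agreement score. The sketch below uses the Rand index, which is a swapped-in measure for illustration. The study's own validation was a qualitative comparison, and all labels shown are hypothetical.

```python
from itertools import combinations

def rand_index(labels_a, labels_b):
    """Share of athlete pairs on which two groupings agree.

    A pair counts as agreement when both groupings put the two athletes
    together, or both put them apart. Cluster numbering does not matter.
    """
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)

# Hypothetical groupings of the same six athletes.
traditional = ["bigs", "bigs", "big-skills", "big-skills", "skills", "skills"]
clusters_all_metrics = [1, 1, 2, 2, 3, 3]   # same partition, different names
clusters_speed_only = [1, 2, 2, 2, 3, 3]    # one athlete moved to another group

score_all = rand_index(traditional, clusters_all_metrics)
score_speed = rand_index(traditional, clusters_speed_only)
```

A score of 1.0 means the two groupings are identical up to renaming. Lower scores from a reduced variable set would suggest that the dropped variables were contributing to the grouping.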
A more thorough and extensive research project could be done using single games to analyze the differences based upon opponent played. This research used an aggregate while also using criteria to obtain an understanding of general game demands versus similar strength opponents. Undoubtedly there would be variation in the results based upon the opponent that was played. Observing the trends over time as well as the differences between types of opponent would be valuable research moving forward.
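The difference between an aggregate of qualifying games and a per-opponent view can be sketched as the same averaging step with a different filter. The rows, opponent names, and the `power5` flag below are hypothetical, and the field layout is an assumption rather than the study's export format.

```python
from collections import defaultdict
from statistics import mean

def average_demands(game_rows, metric, keep=lambda row: True):
    """Average one metric per athlete over the games that pass the `keep` filter."""
    per_athlete = defaultdict(list)
    for row in game_rows:
        if keep(row):
            per_athlete[row["athlete"]].append(row[metric])
    return {athlete: mean(values) for athlete, values in per_athlete.items()}

# Hypothetical per-game records; "TPL" stands in for the PlayerLoad metric.
game_rows = [
    {"athlete": "WR_01", "opponent": "Team_A", "power5": True, "TPL": 400},
    {"athlete": "WR_01", "opponent": "Team_B", "power5": True, "TPL": 500},
    {"athlete": "WR_01", "opponent": "Team_C", "power5": False, "TPL": 300},
    {"athlete": "OL_01", "opponent": "Team_A", "power5": True, "TPL": 250},
    {"athlete": "OL_01", "opponent": "Team_B", "power5": True, "TPL": 350},
]

# Aggregate view: average over games against similar-strength opponents only.
aggregate = average_demands(game_rows, "TPL", keep=lambda r: r["power5"])
# Single-game view: the same metric against one opponent.
vs_team_a = average_demands(game_rows, "TPL", keep=lambda r: r["opponent"] == "Team_A")
```

Clustering the per-opponent averages instead of the aggregate would be one way to explore how groupings shift with the opponent played.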

CONCLUSION
The purpose of this research was to introduce a method that would be useful to American football strength & conditioning professionals who must group athletes based on their roles and expected performance during competition. The game of American football has changed, and position groups do not always fit the traditional mold (bigs, big-skills, and skills). K-means clustering can be used as a means to group athletes for athletic training based on the evolving physical "game" demands. With the increasing use of wearable technology to objectively quantify biomechanical processes, this clustering technique can inform the grouping decisions of strength & conditioning professionals who are required to train large groups of individuals, thereby improving the training of athletes and better preparing their bodies to handle the demands placed upon them during competition.
When comparing the results from the individual season analyses to the traditional groupings, there are enough similarities that the technique feels validated, while there are enough differences that the technique still feels useful. While using an aggregate measurement of games with similar competition level worked well for this study, further research can analyze the effect that individual games have on game demands and how players are grouped. Moving forward, specifically with validating this method as a tool for practitioners, this research could further validate the technique by interviewing strength & conditioning professionals whilst using the tool. This will provide added context from the practitioners themselves that this study was not able to add. Additionally, the more organizations' strength & conditioning teams that test the technique and confirm its usefulness, the more validated the method becomes in this context.