Abstract
To evaluate collective (macro-level) performance in the Chinese Super League, social network analysis and graph theory were applied to team passing data. For each team, we constructed a weighted and directed network in which nodes corresponded to players and arrows to passes. A total of 1200 matches during the 2014–2018 seasons were analysed. The results showed significant differences in general network measures among team competitive levels, match locations and match outcomes. Successful teams and home teams had significantly higher link, diameter, density and clustering coefficient values than did the unsuccessful teams or visiting teams; however, the winning teams had significantly lower density and clustering coefficient 2 (CR2) values than did the losing teams. This study suggested that successful teams or teams with advantages (home, winner) have higher total passes and eigenvalues. This is the first report in which eigenvalues (the degree of a team’s distribution of the ball during the match) were able to capture changes in dynamics between different teams. All the network variables correlated positively with total shots, whereas their correlations with goals were absent or very small.
Introduction
Team sports games depend on cooperation and interaction between teammates to defend against the opponent’s strengths and to exploit the opponent’s weaknesses. Therefore, based on these dynamics, it is possible to consider team sports as a cooperation–opposition game that depends on the players’ interactions. The passing network of a soccer team consists of the players as vertices and the passes between the players as edges. This creates a network that can then be studied for its unique characteristics. Studies on social network analysis have highlighted the benefits of decentralized cooperation for team performance and the successfulness of these tasks. 1 The main rationale behind such results is associated with the benefits of exploiting the individual advantages of each member and generating the best possible result that integrates the contribution of each member, thus optimizing the final result. 2 This rationale is in line with a meta-analysis that summarized these two main conclusions: 3 (i) networks with the highest density values lead to the best performances and (ii) networks with high centralized tendencies are associated with poor performances.
Social network research in team sports in general derives from two main approaches: macro and micro level. From the macro-level perspective, theoretically driven network methods are conducted to assess collective team sport behaviours with little or no regard to individual teammates, while from the micro-level perspective, specialized tools and metrics related to graph theory are applied to evaluate the structural and topological properties of the interpersonal interactions of teammates.
Macro-level research
In water polo, a network analysis of attacking units was performed to identify how the number of intra-team interactions emerges in a match, and the most successful types of intra-team interactions were characterized. 4 The main finding suggested that the most successful collective system behaviour required a high probability of each player interacting with the other team players. 4 A similar result was also found in a study that investigated the English Premier League. 1
Grund analysed the association between network density, centralization and the number of goals in 760 matches involving 23 teams of the English Premier League. 1 His findings suggested that greater levels of network density led to a higher number of goals.
A study that analysed the 2014 FIFA World Cup and the Italian Serie A 2013/2014 season found that the performance of a team and network indicators are associated and may predict the outcomes of the games. 5 In a study that compared the national teams’ performances during the 2014 FIFA World Cup, there were significant differences between winning and losing teams in total links and density, with greater values for winning teams in both cases. 6 Significant differences in the total links and density between the teams that reached the final and the teams that lost in the round of 16 were also found. The comparison of the performance variables with the network properties identified statistically significant positive correlations between the goals scored, shots and shots on goal and the total links, density and clustering coefficient. 6
These studies suggest that the density of the ties in a team’s instrumental social network is positively associated with team task performance and that large values of connectivity between teammates are associated with better overall team performance.
Micro-level research
One of the first studies using the network approach was performed to identify the best players in the European 2008 Championships. 7 In that study, the authors identified the attacking plays that resulted in shots on goal. Using a centrality approach for each analysed team, they found the player with the highest influence (central midfield player, Spanish team). Another study inspecting the attacking stage of basketball games was performed. 8 Their study aimed to identify the team coordination network and the team members’ interactions. The number of team members involved in each attacking play was considered in the study. It was found that for the majority of the time (42%), the unit of attack involved all five members of the team. 8
An analysis of variance revealed significant differences in prominence levels between tactical positions during the 2014 FIFA World Cup. 9 The results showed that midfielders had the highest values of out-degree, in-degree, closeness and betweenness centralities in the majority of tactical line-ups. Another study by Clemente showed that lateral defenders and midfield players were the players who mostly initiated the attacking plays that resulted in goals being scored. 10
However, Pina suggested that density but not the clustering coefficient or centralization was a significant predictor of the successfulness of offensive plays. He also found a negative relation between density and the successfulness of offensive plays. 11 In addition, Cho’s study demonstrated that network indicators can represent a soccer team’s performance and are suitable for win–lose prediction systems. 12
Moreover, Gama identified intra-team interactions by recording the total number of ball contacts achieved by each player and concluded that individual key players influence team performance considerably. 13 Recently, Castellano and Echeazarra analysed the relationship between network-based centrality measures and physical demands in elite football players. 14
One of the most important propositions in the macro-level literature is that the network density and intense interactions between individual members increase team performance. In accordance with this argument, the density–performance hypothesis was developed, which is that increased interaction intensity (such as a higher density, more links, larger clustering coefficients and more passes) leads to increased team performance.
Another hypothesis in the macro-level literature is that centralization – the degree to which network positions are unequally distributed in a team – is related to performance. Researchers do not always agree on this hypothesis in terms of the diameter (no relationship with team performance). Despite this discrepancy, the centralization–performance hypothesis was developed, which is that increased centralization of interactions (such as smaller eigenvalues) in teams leads to decreased team performance.
The most important hypothesis presented in the micro-level literature regarding soccer is that midfielders play the most important role in the passing network. In this paper, we refer to this as the midfielder core hypothesis.
All the previous studies used a cross-sectional study design, and the absence of longitudinal analysis makes it difficult to determine whether the network or the hypothesized effects of the network are causally antecedent. Moreover, no studies have analysed variations in network properties among teams with different final ranks within a league. Furthermore, because the Chinese Super League is a fast-growing professional football league in Asia, examining it is helpful in understanding how social interactions differ across cultures.
Therefore, the aim of this study was to analyse the general properties and centrality levels of the Chinese Super League over a series of seasons. The variance in general and centrality network levels was tested for different competitive levels, match locations, match outcomes and player positions. Moreover, an analysis was conducted to demonstrate the relationship between network metrics and match performance variables.
Methods
The study was approved by the local institutional ethics committee, and we obtained permission to use the data of the CHAMPION Sports Information Technology Company.
Samples
The data used in this study were made available by CHAMPION Sports Information Technology Company (China). Our sample comprised 1200 matches (2400 adjacency matrices, one per team per match) played in the Chinese Super League 2014–2018 seasons. The variable used in this study to classify the network was the pass between teammates. The passing sequences and directions were codified using a dedicated software program (uPATO), 15 as illustrated in Figure 1. An adjacency matrix was generated for each team in each game observed. The matrix represents the connections between a player and his teammates. 4 The frequency of passes was also determined and registered in the matrix. On this basis, this study analysed weighted digraphs, that is, graphs made up of a set of vertices connected by directed edges (arrows) that carry weights.

Example of adjacency matrix per team. In the table, each row shows the number of passes that player n made to each of the remaining teammates. Each column shows the number of passes that player n received from each of their teammates.
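The construction of such a matrix can be sketched as follows. This is a minimal illustration in Python rather than the uPATO workflow used in the study; the function name `build_adjacency_matrix`, the player labels and the pass list are all hypothetical.

```python
from collections import Counter

def build_adjacency_matrix(players, passes):
    """Return a weighted, directed adjacency matrix as a list of lists.

    players: ordered list of player identifiers (hypothetical labels).
    passes:  iterable of (passer, receiver) pairs, one per completed pass.
    Entry [i][j] counts the passes from players[i] to players[j].
    """
    index = {player: k for k, player in enumerate(players)}
    counts = Counter(passes)
    n = len(players)
    matrix = [[0] * n for _ in range(n)]
    for (passer, receiver), count in counts.items():
        if passer == receiver:
            continue  # skip malformed records: no self-passes in the network
        matrix[index[passer]][index[receiver]] = count
    return matrix

# Toy data, invented for illustration only.
players = ["P1", "P2", "P3"]
passes = [("P1", "P2"), ("P1", "P2"), ("P2", "P3"), ("P3", "P1"), ("P2", "P1")]
W = build_adjacency_matrix(players, passes)

passes_made = [sum(row) for row in W]            # row sums: passes performed
passes_received = [sum(col) for col in zip(*W)]  # column sums: passes received
```

Row sums give the passes each player made and column sums the passes each player received, which mirrors the row and column reading of the matrix described in the caption.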
Four competitive levels were also codified as high group A (top four), upper middle group B (ranks 5–8), lower middle group C (ranks 9–12), and low group D (bottom four). The locations were classified as home or away.
Network analysis
The 2400 overall adjacency matrices were generated based on the passes between teammates and then imported into R with the igraph package for analysis. Igraph is an open source application distributed according to the GNU General Public License and is free for non-commercial or commercial use. 16 Igraph allows a researcher to load formatted network data such as sociomatrices and analyse the social and mathematical properties of the corresponding social networks in the form of mathematical graphs. The application also computes basic graph properties, such as density, diameter and link, which were used in this study.
The network analysis of the five seasons in the Chinese Super League was focused on the following six measures based on the connections between teammates: (i) total links, (ii) density, (iii) diameter, (iv) clustering coefficient (in its unweighted, CR1, and weighted, CR2, forms), (v) eigenvalue and (vi) total passes. Brief information is provided in Table 1.
Brief description of the variables.
Link
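Link indicates the size of the graph (number of edges), equation (1). 17 Given one unweighted graph G with n vertices, the link index, L, of G is calculated as
$$L = \sum_{i=1}^{n} \sum_{j=1,\, j \neq i}^{n} a_{ij} \qquad (1)$$
where $a_{ij} = 1$ if player i made at least one pass to player j and $a_{ij} = 0$ otherwise.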
Density
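The density of a graph is the ratio of the number of edges to the number of possible edges, equation (2). 17 Given one weighted graph G with n vertices and link index L, the density, D, of G is calculated as
$$D = \frac{L}{n(n-1)} \qquad (2)$$
where $n(n-1)$ is the number of possible directed edges between the n players.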
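As a hedged illustration of the link, density and pass measures, the sketch below computes them from a small weighted adjacency matrix. The matrix `W` is toy data and the function names are hypothetical; density is taken as the number of links divided by n(n − 1), following the verbal definition above.

```python
def total_links(W):
    """Number of directed edges: ordered pairs (i, j) with at least one pass."""
    n = len(W)
    return sum(1 for i in range(n) for j in range(n) if i != j and W[i][j] > 0)

def density(W):
    """Ratio of existing directed edges to the n(n - 1) possible ones."""
    n = len(W)
    return total_links(W) / (n * (n - 1))

def total_passes(W):
    """Sum of all edge weights, i.e. all passes between teammates."""
    return sum(sum(row) for row in W)

# Toy weighted adjacency matrix (rows: passer, columns: receiver).
W = [
    [0, 2, 0],
    [1, 0, 1],
    [1, 0, 0],
]

links = total_links(W)
team_density = density(W)
passes = total_passes(W)
```

Links and density ignore the pass counts and only record whether a connection exists, whereas total passes uses the weights, which is one reason the two kinds of measure can behave differently.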
Diameter
The diameter of a graph is the maximum distance (the length of the longest geodesic) between any two connected nodes and is computed by the formula diameter = $\max_{i,j} d(i, j)$, where $d(i, j)$ is the length of the shortest path from node i to node j.
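A minimal sketch of this computation is given below, assuming that distance is counted in passes (hops) along directed edges and that pairs with no connecting path are ignored. Whether the original analysis used weighted or unweighted distances is not stated, so the unweighted choice here is an assumption, and the matrix is toy data.

```python
from collections import deque

def geodesic_lengths(W, source):
    """Breadth-first search: hop distance from source to every reachable node."""
    n = len(W)
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in range(n):
            if W[u][v] > 0 and v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def diameter(W):
    """Longest shortest path over all connected ordered pairs of nodes."""
    return max(max(geodesic_lengths(W, s).values()) for s in range(len(W)))

# Toy weighted adjacency matrix (rows: passer, columns: receiver).
W = [
    [0, 2, 0],
    [1, 0, 1],
    [1, 0, 0],
]

team_diameter = diameter(W)
```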
Clustering coefficient
Unweighted clustering coefficient (CR1)
In the case of directed graphs, the (local) clustering coefficient of a node is the proportion of links present between the nodes directly connected to it (the neighbourhood $N_i$ of the node). Thus, the local clustering coefficient of each node i is computed as the number of arcs $a_{jk}$ between the $k_i$ nodes in its neighbourhood divided by the maximum number $k_i(k_i - 1)$ of links that could exist among them. 6 That is
$$C_i = \frac{\sum_{j, k \in N_i,\, j \neq k} a_{jk}}{k_i (k_i - 1)}$$
This condition is applied for all j, k nodes in the neighbourhood N of i. Thus, the local clustering coefficient measures the degree of interconnectivity in the neighbourhood of a node. A high value means that the node and its neighbours are close to forming a clique. We used a variant of the global version of the clustering coefficient, which measures the level of clustering in the entire network. This variant is the network average of the local clustering coefficients
$$C = \frac{1}{n} \sum_{i=1}^{n} C_i$$
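The unweighted coefficient can be sketched as follows. This illustration assumes that the neighbourhood of a player contains every teammate connected to him in either direction and that a player with fewer than two neighbours scores zero; both conventions, the function names and the toy matrix are assumptions, not details confirmed by the study.

```python
def neighbourhood(W, i):
    """Teammates linked to player i by a pass in either direction."""
    n = len(W)
    return [j for j in range(n) if j != i and (W[i][j] > 0 or W[j][i] > 0)]

def local_clustering(W, i):
    """Arcs present among the neighbours of i, over the k(k - 1) possible."""
    neighbours = neighbourhood(W, i)
    k = len(neighbours)
    if k < 2:
        return 0.0  # convention assumed here for players with < 2 neighbours
    arcs = sum(
        1 for j in neighbours for m in neighbours if j != m and W[j][m] > 0
    )
    return arcs / (k * (k - 1))

def average_clustering(W):
    """Network average of the local clustering coefficients."""
    n = len(W)
    return sum(local_clustering(W, i) for i in range(n)) / n

# Toy weighted adjacency matrix (rows: passer, columns: receiver).
W = [
    [0, 2, 0],
    [1, 0, 1],
    [1, 0, 0],
]

cr1 = average_clustering(W)
```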
Weighted clustering coefficient (CR2)
Clemente and Grassi proposed a new local clustering coefficient for weighted and directed networks. 18
The numerator of the coefficient takes into account all directed triangles that a node i actually forms with its neighbours, weighted with the average weight of the links connecting a node i to its adjacent nodes j and k. The denominator is all the possible (appropriately weighted) directed triangles that it could form, equation (5). Formally, a directed graph (or digraph) D = (V, A) is a pair of sets V and A, where V is the set of n vertices (or nodes) and A is the ordered set of m pairs (arcs) of vertices of V; if (i, j) or (j, i)
Global clustering coefficient
Eigenvalue
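The eigenvalue corresponds to the calculated eigenvector (the centrality scores), equation (6). 19 Given one weighted graph G with n vertices and weighted adjacency matrix $W = (w_{ij})$, the centrality scores $x_i$ and the eigenvalue $\lambda$ satisfy
$$\lambda x_i = \sum_{j=1}^{n} w_{ji}\, x_j, \qquad i = 1, \dots, n \qquad (6)$$
where $\lambda$ is the largest eigenvalue of W.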
Pass
The pass is the total number of successful passes between teammates.
All seven of the abovementioned metrics were computed per team in each game.
Statistical procedures
All analyses were executed in IBM® SPSS Statistics for Windows, version 20.0 (SPSS Inc., Chicago, IL, USA). The magnitude of the effect size (ES, r) for the Wilcoxon signed-ranks test was classified as trivial (<0.1), small (0.1–0.3), moderate (>0.3–0.5), large (>0.5–0.7) and very large (>0.7–1.0), based on guidelines from Batterham and Hopkins. 12
The magnitude of the ES (
Network visualization and computation of network statistics were conducted with the igraph package (version 1.2.4.1), 16 the DirectedClustering R package and R version 3.5.3 (R-Project; www.r-project.org).
Results
Seasonal analysis
Table 2 shows a statistically significant difference in all the network metrics (except diameter) across the five seasons. The 2018 season had significantly (adjusted p < 0.01) lower values of link, density and CR2 compared with the three seasons from 2014 to 2016. The 2018 season had significantly lower CR1 values than the 2014 (adjusted p < 0.01) and 2016 (adjusted p < 0.05) seasons. The 2017 and 2018 seasons had significantly lower pass values compared with the 2016 season (adjusted p < 0.05). The 2015 season had significantly lower eigenvalues than the 2016 season.
Descriptive statistics table (mean ± standard deviation and median) and statistical comparison between factors (Season).
Note: In the Kruskal–Wallis H test, A and a signify that the Bonferroni-adjusted Mann–Whitney U test showed a significant difference in 2014 (adjusted P < 0.01 and P < 0.05, respectively); B and b showed a significant difference in 2015; C and c showed a significant difference in 2016; D and d showed a significant difference in 2017; E and e showed a significant difference in 2018. All effect sizes were small.
League-ranking effect analysis
Table 3 revealed a statistically significant difference in link across the four different team groups (H (3, n = 2400) =82.57, p < 0.001, small effect size). Group A (the top 4 teams) recorded a higher median score (Md = 101) than the other three team groups, which recorded median values of 97, 96 and 95.
Descriptive table (mean ± standard deviation and median) and statistical comparison between factors (league rank).
Note: In the Kruskal–Wallis H test, A and a signify that the Bonferroni-adjusted Mann–Whitney U test showed a significant difference in Group A (adjusted P < 0.01 and P < 0.05, respectively); B and b showed a significant difference in Group B; C and c showed a significant difference in Group C; D and d showed a significant difference in Group D; A: Top 4 teams; B: Upper-middle 4 teams; C: Lower-middle 4 teams; D: Bottom 4 teams. All effect sizes were small.
For the density, there was a statistically significant difference across the four different team groups (H (3, n = 2400) =52.185, p < 0.001, small effect size). Group A (the top 4 teams) recorded a higher median score (Md = 0.571) than the other three team groups, which recorded median values of 0.56, 0.553 and 0.549.
For the clustering coefficient, the network statistics with statistically significant differences between the groups of teams were the CR1 (H (3, n = 2400) =31.467, p < 0.001, small effect size) and CR2 (H (3, n = 2400) =142.352, p < 0.001, small effect size). In CR1, group A (the top 4 teams) recorded a higher median score (Md = 0.82) than the other three team groups, which recorded median values of 0.812, 0.809 and 0.805. In CR2, group A (the top 4 teams) recorded a higher median score (Md = 0.749) than the other three team groups, which recorded median values of 0.728, 0.72 and 0.717.
Eigenvalue was influenced by league-ranking (H (3, n = 2400) =133.243, p < 0.001, small effect size). Group A (the top 4 teams) recorded a higher median score (Md = 30.693) than the other three team groups, which recorded median values of 28.234, 25.662 and 25.177.
The Kruskal–Wallis test revealed a statistically significant difference in passes across the four different team groups (H (3, n = 2400) =52.185, p < 0.001, small effect size). Group A (the top 4 teams) had a higher median score (Md = 326) than the other three team groups, which had median values of 299, 265.5 and 270.
Match outcome effect analysis
Table 4 revealed a statistically significant difference in link across the three different outcome groups (H (2, n = 2400) =82.213, p < 0.001, small effect size). The loss group recorded a higher median score (Md = 101) than the other two outcome groups, which recorded median values of 95 (win) and 94 (draw).
Descriptive table (mean ± standard deviation and median) and statistical comparison between factors (match outcome).
Note: In the Kruskal–Wallis H test, W and w signify that the Bonferroni-adjusted Mann–Whitney U test showed a significant difference in the Win Group (adjusted P < 0.01 and P < 0.05, respectively); D and d showed a significant difference in the Draw Group; L and l showed a significant difference in the Loss Group. All effect sizes were small.
The diameter was statistically significantly different across the three different outcome groups (H (2, n = 2400) =49.536, p < 0.001, small effect size).
The density was significantly different between the three different outcome groups (H (2, n = 2400) =82.213, p < 0.001, small effect size). The loss group recorded a higher median score (Md = 0.571) than the other two outcome groups, which recorded median values of 0.538 (win) and 0.555 (draw).
CR2 was influenced by match outcome (H (2, n = 2400) =15.598, p < 0.001, small effect size). The loss group recorded a higher median score (Md = 0.734) than the win group, which recorded a median value of 0.717. No differences across the three groups of match outcomes were found in CR1.
The Kruskal–Wallis test revealed a statistically significant difference in eigenvalue across the three different outcome groups (H (2, n = 2400) =20.818, p < 0.001, small effect size). The win group recorded a higher median score (Md = 28.733) than the other two outcome groups, which recorded median values of 26.901 (loss) and 26.81 (draw).
There was a statistically significant difference in pass across the three different outcome groups (H (2, n = 2400) = 8.793, p < 0.001, small effect size). The win group recorded a higher median score (Md = 298) than the draw group, which recorded a median value of 281.
Location effect analysis
A Wilcoxon signed-ranks test indicated that all the variables were statistically significantly higher in the home group than in the away group.
As shown in Table 5, link was statistically significantly greater for home than away teams (z = −3.981, p < 0.001) with a small effect size (r = 0.081). The median score on the link was larger for home (Md = 99) than for away (Md = 96). The Wilcoxon signed-ranks test revealed a statistically significantly larger diameter for home than away teams (z = −2.246, p = 0.025) with a small effect size (r = 0.046). A significantly higher density for home than away teams (z = −4.33, p < 0.001) with a small effect size (r = 0.088) was also revealed. The median density score was higher for home (Md = 0.566) than for away (Md = 0.549). Similarly, statistically significantly larger values for cluster coefficients 1 and 2 were found for the home teams than for the away teams, CR1 (z = −3.146, p < 0.01) and CR2 (z = −4.085, p < 0.001), with a small effect size (CR1: r = 0.064, CR2: r = 0.083). The median score of cluster coefficient 1 was higher for home (Md = 0.814) than for away (Md = 0.808), and the median score of cluster coefficient 2 was also higher for home (Md = 0.737) than for away (Md = 0.720).
Descriptive table (mean ± standard deviation and median) and statistical comparison between factors (home or away).
Note: All effect sizes were small.
Finally, playing at home or away was statistically significant for explaining the results in eigenvalues (z = −5.88, p < 0.001, small effect size r = 0.12) and total passes (z = −6.071, p < 0.001, small effect size r = 0.124). The median eigenvalue score was higher for home (Md = 28.766) than for away (Md = 26.131), and the median pass score was higher for home (Md = 305) than for away (Md = 276).
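The effect sizes reported for the location analysis can be reproduced from the z statistics with a short calculation. The sketch below assumes the common convention r = |z| / √N with N = 2400 team-match observations; this formula is not stated in the text, but it matches every reported value after rounding. The function name is hypothetical.

```python
import math

def wilcoxon_effect_size(z, n_observations):
    """Effect size r for a Wilcoxon signed-ranks test, assuming r = |z| / sqrt(N)."""
    return abs(z) / math.sqrt(n_observations)

# (z, reported r) pairs taken from the location effect analysis.
reported = {
    "link": (-3.981, 0.081),
    "diameter": (-2.246, 0.046),
    "density": (-4.33, 0.088),
    "CR1": (-3.146, 0.064),
    "CR2": (-4.085, 0.083),
    "eigenvalue": (-5.88, 0.12),
    "pass": (-6.071, 0.124),
}

N_OBSERVATIONS = 2400  # assumed: 1200 matches x 2 teams

recomputed = {
    name: round(wilcoxon_effect_size(z, N_OBSERVATIONS), 3)
    for name, (z, _) in reported.items()
}
```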
Relationship analysis
The relationship between team performance variables (goals scored, overall shots, shots on goal, free kicks and corner kicks) and the characteristics of the network graphs (link, network density, clustering coefficient, diameter, eigenvalue and pass) was investigated using Spearman’s rank correlation coefficient. The values of the coefficients are shown in Table 6.
Correlation values between the team performance variables and the network values provided by the metrics.
Note: *Correlation is significant at p ≤ 0.050. **Correlation is significant at p ≤ 0.010. ***Correlation is significant at p ≤ 0.001.
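For readers unfamiliar with the statistic, Spearman's rank correlation is the Pearson correlation of the ranked values, with tied observations sharing their average rank. The sketch below illustrates this on invented numbers; the data, variable names and function names are hypothetical and are not taken from the study.

```python
def average_ranks(values):
    """1-based ranks; tied values receive the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of 1-based positions start..end
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Invented per-match values for illustration only.
total_passes = [250, 310, 280, 400, 330]
overall_shots = [8, 12, 9, 15, 12]

rho = spearman_rho(total_passes, overall_shots)
```

Because the statistic depends only on ranks, it is insensitive to the very different scales of the network metrics, such as density (below 1) and total passes (in the hundreds).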
Goals scored showed very small positive correlations with graph diameter (ρ = 0.105; p < 0.001), eigenvalue (ρ = 0.093; p < 0.001) and pass (ρ = 0.063; p < 0.001). A very small negative correlation existed between goals scored and density (ρ = −0.083; p < 0.001). There was no evidence to indicate that a high level of goals scored was associated with high values for clustering coefficient (CR1: ρ = 0.010; p > 0.050, CR2: ρ = −0.022; p > 0.050) or link (ρ = −0.039; p > 0.050).
Overall shots had small positive correlations with link (ρ = 0.278; p < 0.001), network density (ρ = 0.268; p < 0.001), clustering coefficient (CR1: ρ = 0.188; p < 0.001; CR2: ρ = 0.318; p < 0.001), eigenvalue (ρ = 0.376; p < 0.001) and pass (ρ = 0.383; p < 0.001). There was a very small positive correlation between shots and graph diameter (ρ = 0.121; p < 0.001). High overall shots were associated with high link, network density, clustering coefficient, eigenvalue and pass values.
Shots on goal had very small positive correlations with link (ρ = 0.125; p < 0.001), graph diameter (ρ = 0.099; p < 0.001), network density (ρ = 0.089; p < 0.001) and clustering coefficient (CR1: ρ = 0.092; p < 0.001, CR2: ρ = 0.139; p < 0.001). Furthermore, they had small positive correlations with eigenvalue (ρ = 0.22; p < 0.001) and pass (ρ = 0.215; p < 0.001). High shots on goal were associated with high eigenvalue and pass values.
Corner kicks had very small positive correlations with link (ρ = 0.196; p < 0.001), graph diameter (ρ = 0.054; p < 0.001) and network density (ρ = 0.187; p < 0.001). In addition, they had small positive correlations with clustering coefficients (CR1: ρ = 0.117; p < 0.001, CR2: ρ = 0.205; p < 0.001), eigenvalue (ρ = 0.253; p < 0.001) and pass (ρ = 0.261; p < 0.001). High corner kicks were associated with high clustering coefficient, eigenvalue and pass values.
Free kicks had no clear correlation with any of the seven network variables.
Player position analysis
Table 7 shows a statistically significant difference in eigenvalues across the six different position groups (H (5, n = 33021) = 5007.45, p < 0.001, large effect size). The Central Midfielder group recorded a significantly higher median score (Md = 17.10) than the groups of the other five positions (External Defender, External Midfielder, Forward, Central Defender and Goalkeeper), which recorded median values of 15.05, 13.85, 13.27, 10.52 and 2.66, respectively.
Descriptive table (mean ± standard deviation and median) and statistical comparison between player positions (N = 33,021).
Note: In the Kruskal–Wallis H test, upper-case and lower-case letters signify that the Bonferroni-adjusted Mann–Whitney U test showed a significant difference in the indicated group (adjusted P < 0.01 and P < 0.05, respectively): A and a, Goalkeeper group; B and b, External Defender group; C and c, Central Defender group; D and d, External Midfielder group; E and e, Central Midfielder group; F and f, Forward group.
Discussion
Our findings provide compelling support for the view that passing networks have important effects on performance in association football. The key finding of this study is that the analysis of total passes supports the density–performance hypothesis, and the analysis of eigenvalues supports the centralization–performance hypothesis, while the analysis with links, density and cluster coefficients yields inconsistent results regarding the validation of the density–performance hypothesis.
The research suggests that teams with some kind of advantage (a successful team with good players or home advantage) have significantly higher values than inferior teams (unsuccessful teams or those with away disadvantages) for the links, density, cluster coefficients, number of successful passes (supporting the density-performance hypothesis) and eigenvalues (supporting the centralization–performance hypothesis).
Recent scientific literature suggests that successful teams have the highest levels of network density, total links and clustering coefficients. Thus, the ability to increase the connection between all teammates may result in excellent overall team performance in a tournament. 6 Top teams also presented more ball touches, passes and pass accuracy than bottom teams during the Spanish national championship ‘La Liga BBVA’ 2012–2013. 21
Additionally, we found that winning teams had more successful passes than losing and drawing teams. Our result is supported by previous studies: a higher frequency of short passes was reported for winning teams than for drawing and losing teams during EURO 2016, 22 and the total number of passes was positively associated with team performance. 11 Based on match outcomes, these results reinforce the idea that a more decentralized offensive process, based on passing progression and on a supporting progression strategy, could lead teams to be more successful during football matches.
The increase in general player cooperation properties in home matches could be explained by greater identification and self-confidence when athletes play in an environment to which they are already accustomed, compared with matches played away from home. 23
The overwhelming advantage of the top four teams in total passes is in agreement with the findings obtained by Buldu and his colleagues, 24 who suggested that the number of passes made by F.C. Barcelona is much higher than those of their rivals.
It should be mentioned that all effect sizes show that the parameters investigated in this study have only a minor impact on the soccer match performance of the Chinese Super League, which is in line with a previous study. 6
However, we demonstrated that losing teams have significantly higher values than do winning teams for links, density and clustering coefficient 2 (the exact opposite of what the density–performance hypothesis predicts), and there are no significant differences in clustering coefficient 1 among the three outcomes (which does not support the density–performance hypothesis). In accordance with the present results, previous studies have demonstrated a negative relation between density and the successfulness of offensive plays in the UEFA Champions League 2015/2016 and the 2016 European Football Championships. 11,25 Our results support previous findings about centralized interaction patterns leading to decreased team performance in the English Premier League. 1
It should be mentioned that our results are inconsistent with those reported in Clemente’s related research, which reported higher values of density and clustering coefficient 1 for winning teams during the 2014 FIFA World Cup. 6 The discrepancy between this previous study and the present findings could be due to the instability of network metrics such as density and clustering coefficient 1 as measures, or may alternatively be attributable to differences in the competition format (tournament versus league) and sample size (64 versus 1200 matches).
Based on our results, we hypothesize that winning teams adopt a more defensive style to maintain their favourable score. Losing teams would therefore need a higher frequency of passes to create scoring opportunities. Our observation is consistent with previous reports that the clustering coefficient is not a significant predictor of team performance. 11,26,27 Pina indicated that different offensive styles can be equally effective for a team to succeed. 11
Pina showed that low network density is associated with a higher overall number of offensive plays, and high density was associated with fewer and/or longer offensive plays, which reduces the possibilities of a team moving into the finishing zone. 11 Previous studies corroborate our findings, showing that a higher cluster coefficient and network density are associated with worse team performance. Further investigation is needed to clarify this issue.
This study also analysed the association between the network metrics and several regular performance variables, such as goals scored, shots, shots on goal, free kicks and corner kicks. A previous study that analysed 760 matches in the English Premier League identified that the number of goals scored is strongly associated with the highest levels of network density. 1 In another study, significant positive correlations were found between the total links and network density and the number of goals scored, shots and shots on goal. 6 In our study, goals scored were not strongly associated with the network density and clustering coefficients. Our results therefore do not support the density–performance hypothesis. It is important to emphasize, however, that there are differences between the competitions analysed regarding the quality of the opponents and the respective competitions in general, which could influence these results. This would be a pertinent starting point for future studies of this subject.
In the present study, total passes and eigenvalues were significant predictors of team performance, team competitive level and home advantage. These two macro variables are able to capture changes in team dynamics, which makes direct comparisons of networks in team sports more meaningful.
In the player position analysis, the highest eigenvector centrality values (the elements of the eigenvector associated with the eigenvalue) were found for the central midfielders, and the central midfielder group had significantly higher centrality values than the other four positions. This result is consistent with previous studies on football,7,9 which suggested that the central midfielder position contributes significantly to the build-up of attacks. Thus, our study supports the midfielders’ core hypothesis.
The eigenvector centrality allows an evaluation of the player’s role in the passing network. The metric expresses the importance of a node by observing its connections to other important nodes; the centrality values are the elements of the eigenvector associated with the eigenvalue. A player with a very high eigenvector centrality interacts with several important teammates, which suggests a central role. A player with a low eigenvector centrality can be considered a peripheral teammate who interacts with only a few, not very central, players. Centrality in football, however, does not mean that the player with the highest value is the best player or even the most effective player. It does give an idea of which players are most important to the team’s distribution of the ball during the match. It is quite possible that such players also correspond to the ‘best’ players, but that is not necessarily true, and it is important to be aware of that.
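As an illustration of the metric described above, the following minimal sketch computes the leading eigenvalue and the eigenvector centralities of a small passing network by power iteration. The four-player pass matrix is hypothetical, the function name `eigenvector_centrality` is our own, and the choice to score a player by the passes he receives from important teammates (rather than the passes he makes) is an assumption; the study itself may have used a different convention or software.

```python
def eigenvector_centrality(passes, iterations=1000, tol=1e-12):
    """Power iteration on a weighted, directed pass matrix.

    passes[i][j] is the number of passes from player i to player j.
    Assumption: a player is important when he RECEIVES passes from
    important teammates, so each step sums over the columns of the matrix.
    Returns (leading eigenvalue estimate, centralities that sum to 1).
    """
    n = len(passes)
    v = [1.0 / n] * n  # start from a uniform vector
    eigenvalue = 0.0
    for _ in range(iterations):
        # w[j] = sum over passers i of passes[i][j] * v[i]
        w = [sum(passes[i][j] * v[i] for i in range(n)) for j in range(n)]
        eigenvalue = sum(w)  # v sums to 1, so this converges to the eigenvalue
        if eigenvalue == 0:
            raise ValueError("the pass matrix contains no passes")
        w = [x / eigenvalue for x in w]
        converged = max(abs(a - b) for a, b in zip(w, v)) < tol
        v = w
        if converged:
            break
    return eigenvalue, v


# Hypothetical pass counts for four players; player 1 plays the role of
# a central midfielder who receives most of the passes.
passes = [
    [0, 8, 2, 1],
    [6, 0, 7, 5],
    [2, 9, 0, 3],
    [1, 6, 2, 0],
]

eigenvalue, centrality = eigenvector_centrality(passes)
most_central = max(range(len(centrality)), key=lambda i: centrality[i])
print(f"leading eigenvalue: {eigenvalue:.3f}")
print(f"most central player: {most_central}")
```

In this toy network the player who receives the most passes from well-connected teammates obtains the highest centrality, which mirrors the interpretation given for central midfielders. Power iteration converges here because every pair of players exchanges at least one pass; sparser networks may call for a library eigen-solver instead.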
Another important contribution of this article is the exploration of the dynamic time-series characteristics of network structure. The results suggest that the Chinese Super League experienced a significant decline in links, cluster coefficient 2, density and passes across the five seasons. Unfortunately, owing to the lack of relevant literature, we cannot explain this downward trend from a technical or tactical perspective. From an economic perspective, the mean first-team pay for each club per season has increased over the past four seasons, while its standard deviation has decreased overall (2016 season 28 : 588,449 ± 540,427; 2017 season 29 : 788,239 ± 560,000; 2018 season 30 : 802,798 ± 472,779; 2019 season 31 : 963,844 ± 482,985, units: GBP [Great Britain Pound]). Given the salary determination and the pay–performance relationship in professional soccer, 32 the increase in wage levels and the decrease in salary differences suggest that the players’ levels have become higher and closer and that competition has become more intense, which may affect the quantity and quality of the network structure.
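The narrowing of salary differences can be made explicit with the coefficient of variation (standard deviation divided by mean). This measure is not reported in the study; the short sketch below simply applies it to the pay figures quoted above as an illustrative check.

```python
# Mean and standard deviation of first-team pay per club (GBP),
# as quoted in the text for the 2016 to 2019 seasons.
pay = {
    2016: (588_449, 540_427),
    2017: (788_239, 560_000),
    2018: (802_798, 472_779),
    2019: (963_844, 482_985),
}

# Coefficient of variation = standard deviation / mean.
# This is an illustrative measure chosen here, not one used in the study.
cv = {season: sd / mean for season, (mean, sd) in pay.items()}

for season, value in cv.items():
    print(f"{season}: CV = {value:.2f}")
```

On these figures the coefficient of variation falls in every season, from about 0.92 in 2016 to about 0.50 in 2019, even though the standard deviation itself rises slightly between 2016 and 2017 and again between 2018 and 2019.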
Finally, the passing networks that we have presented provide an attractive visual summary or ‘snapshot’ of a football team’s style. The obvious limitation of these networks is that they are static. However, as we have seen, they can be complemented with the computation of centrality measures that provide useful information about the importance and connectedness of individual players, which might benefit coaches, sports journalists and league supporters.
Conclusion
This study examined the effects of match-related factors and team characteristics on network metrics in a professional football league. The results underline that teams change their macrostructures according to team status (i.e. successful or unsuccessful) and match status (i.e. losing, drawing or winning, and home or away). High total passes and eigenvalues are associated with good team performance. Furthermore, the Chinese Super League showed a significant downward trend in links, cluster coefficient 2, density and passes across the five seasons.
Footnotes
Declaration of conflicting interests
The author(s) declare no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Fundamental Research Funds for the Central Universities in China (grant number 2017XZA217).
