The Public Sector Premium in Returns to Education and the Allocation of Human Capital in China

Abstract

This study examines the education return premium in China’s public sector and its role in driving labor allocation. Using Chinese General Social Survey (CGSS) data from 2003 to 2021, we employ propensity score matching (PSM) to address selection bias and Oaxaca-Blinder (OB) decomposition to separate differences in characteristics from differences in returns. Results show a significant education return premium (Δβ = 0.5014) in the public sector despite an overall income discount (ATT = −0.324). Heterogeneity analysis reveals stronger premiums in SOEs than in the core public sector (government/institutions). Macro-temporal analysis confirms a positive correlation between the education premium and the public employment share. Robustness checks, including Rosenbaum bounds, CIA tests, and placebo tests (fake sector), support the findings. Policy implications suggest optimizing public sector compensation to leverage education premiums and enhancing private sector benefits to attract talent.

Plain Language Summary

Why do many highly educated workers in China choose jobs in the public sector, even though average salaries there can be lower than in private firms? This study finds the answer is not in the base salary, but in the ‘education premium’—the extra income gained from each additional year of schooling. Our analysis of national survey data from 2003 to 2021 shows that the public sector rewards education more highly than the private sector does. This higher return on education, along with benefits like job security, makes public sector careers attractive to educated individuals. This talent flow, however, may impact economic innovation and growth. Therefore, we suggest that policymakers optimize the public sector’s compensation structure to better utilize its human capital, while enhancing the benefits package in the private sector to better compete for talent.

Keywords

public sector; returns to education; sectoral choice; human capital allocation; China; propensity score matching

Introduction and Literature Review

The national civil service examination, known as “China’s first exam,” has grown steadily more popular over the 30 years since China established its civil service examination and recruitment system in 1994. According to data from the website of Zhonggong Education, the number of applicants for the national exam reached 2,913,800 in 2024, up from 1.437 million in 2020, more than doubling in just 4 years. Beyond the sheer size of this “army” of exam takers, many applicants hold advanced degrees. According to Peking University’s 2021 report on the employment quality of its graduates, 14.86% of its 2021 undergraduate graduates went to work in the public administration, social security and social organization industries. (According to the classification standard of National Economic Industries of the People’s Republic of China, these industries mainly include: ① organs of the Communist Party of China; ② State institutions; ③ The CPPCC and the democratic parties; ④ Social security; ⑤ Mass organizations, social organizations and other member organizations; ⑥ Grassroots self-governing organizations, industries and other categories. This grouping is roughly equivalent to the public sector and is hereinafter referred to as the public sector industry.) Among master’s graduates, the proportion employed in the public sector industry was as high as 19.87%; the proportion among doctoral graduates was lower, but still 9.19%. Tsinghua University’s 2022 report on the quality of employment of its graduates shows a similar pattern: among its 2022 graduates, the proportions of bachelor’s, master’s and doctoral degree holders going to the public sector industry were 10%, 17.1%, and 17.1%, respectively.
Given China’s already limited stock of human capital, the rush of so many talented graduates into the public sector is likely to have adverse effects on China’s economic development. A large number of studies have analyzed the adverse economic and social consequences of this flow of talent into the public sector in China, covering economic growth, innovation, and other areas. On economic growth, for instance, S. Li and Yin (2017), based on empirical analysis of China’s prefecture-level data, demonstrated that top talent disproportionately flowed into governmental sectors, which deprived the private sector of vital human capital and ultimately hampered economic growth. Additionally, J. Li and Nan (2019) reinforced this thesis, showing that excessive allocation of human capital to the public sector is detrimental to sustained economic expansion, particularly by dampening productivity gains in innovation-intensive industries. From a policy perspective, Zhu and Wei (2024) revealed that corrective measures for human capital misallocation in the public sector would not only elevate the steady-state economic growth rate but also generate substantial welfare improvements across the economy, effectively constituting a Pareto improvement. In terms of innovation, Y. Chen and Xu (2019) demonstrate that excessive talent allocation to government sectors significantly suppresses regional innovation efficiency. This detrimental effect on innovation is further corroborated by Tan and Li (2019), X. Z. Li et al. (2022), and Z. Sun (2023), whose analyses consistently reveal that human capital overcrowding in public institutions stifles innovation outputs. Finally, other studies document adverse impacts of human capital misallocation on exports (W. Jiang et al., 2019; S. Y. Li, 2022), income inequality (Y. A. Chen, 2020; Y. A. Chen & Xu, 2019), and consumption (J. Li & Si, 2020).

A comprehensive review of the extant literature reveals that excessive accumulation of human capital within the public sector could potentially impede broader socio-economic progress through various adverse mechanisms. This raises a pivotal question: How can policymakers effectively guide human capital allocation across sectors? Addressing this challenge requires a comprehensive understanding of the underlying causes of human capital misallocation to formulate targeted solutions. Therefore, identifying the root causes of human capital misallocation between public and private sectors becomes imperative.

Baumol (1990) proposed that the compensation structure embedded in social institutions determines human capital allocation. Specifically, when a wage premium exists in the public sector, the labor force tends to rationally favor public sector employment. Cavalcanti and Santos (2021) further underscore the critical role of public-private sector wage differentials in shaping human capital allocation and economic growth. Their counterfactual simulation using Brazilian data reveals that reducing the public sector wage premium from 19% to 15% would yield a 4.7% increase in long-term output. A critical question arises: Does such a public sector wage premium indeed exist? The empirical literature reveals mixed findings. Most studies in developed economies support its existence. For instance, Smith (1976) conducted the seminal analysis of public-private wage differentials using U.S. Census Public Use Samples from 1960 and 1970. The study found that federal employees earned significantly more than private-sector counterparts, with 65% of the gap attributable to discriminatory factors (e.g., non-merit-based hiring). Similarly, Gunderson (1979) analyzed 1971 Canadian Census data and identified a public sector wage premium, particularly pronounced among women and low-income workers. Krueger (1988) reinforced this consensus through multi-dataset analyses of U.S. federal, state, and local government wages, consistently finding public sector wages exceeding private sector levels. In a recent study, Sławińska (2021) employs the Oaxaca-Blinder decomposition technique to analyze public-private sector wage differentials across 26 EU nations during 2008 to 2013. The findings reveal that public-sector employees secured significantly higher compensation than their private-sector counterparts in most countries, with structural wage gaps persisting amid fiscal consolidation. However, contrasting evidence emerges from Pederson et al. (1990), whose analysis of Danish data (1976–1985) indicates lower public sector wages relative to the private sector.

Research on developing countries yields even more divergent conclusions. For instance, Nielsen and Rosholm (2001) analyzed Zambian cross-sectional data (1991, 1993, 1996) and identified a consistent public sector wage premium across all income quantiles. In contrast, Stelcner et al. (1989) applied a switching regression model to Peru’s 1985–1986 labor market data, revealing significantly lower average wages in the public sector compared to the private sector, a disparity that incentivized public sector workers to engage in part-time private employment. Similarly, Adamchik and Bedi (2000) employed a switching regression framework in Poland and found a private sector wage advantage, particularly pronounced among university-educated workers. Botchway and Asiedu (2020) demonstrate through Oaxaca-Blinder (OB) and Jones-Mendola-Pezzotti (JMP) decompositions that private-sector employees in Ghana command a 23% wage premium relative to state-owned enterprise (SOE) workers, revealing market-driven compensation advantages in the formal private economy. Imbert (2013) documented phased fluctuations in Vietnam’s public-private income gap, suggesting temporal heterogeneity in sectoral wage differentials. In a recent study, El-Haddad and Gadallah (2021) find evidence of institutional wage premiums within Egypt’s public sector using the RIF-FFL decomposition methodology. Cross-country analyses further complicate the picture: Panizza et al. (2001) examined panel data from 17 Latin American countries (1980–1998), estimating an initial 14% public sector premium that dropped to 4% after excluding informal sector workers. In a comparable vein, Gindling et al. (2020) demonstrate through a cross-national analysis of 68 countries that while the public sector typically pays a wage premium when compared to the private sector including informal segments, this premium dissipates entirely when benchmarked solely against the formal private sector, highlighting how informal labor markets distort conventional sectoral wage comparisons.

Studies on China’s public-private sector income differentials yield inconsistent conclusions, which can be categorized into three groups based on opposing findings.

(a) Studies Identifying a Public Sector Premium

J. W. Zhang and Xue (2008) employed the Heckman two-step method using PSFD (Panel Study of Family Dynamics) data to analyze wage disparities between state-owned and non-state-owned sectors. Their analysis revealed a significant wage premium in state-owned enterprises (SOEs), with average wages twice those of the non-state sector. Notably, over 80% of this gap was attributed to human capital differences. Extending this work, the authors applied the Heckman selection model to 2002 CHIPS (China Household Income Project Survey) data, controlling for labor force participation and sector selection biases. Both OLS and Heckman estimates confirmed a persistent public sector wage premium, with the gap widening and discriminatory factors (e.g., institutional barriers) explaining a higher proportion when selection biases were accounted for. Contrastingly, Zhao (2002) analyzed 1996 urban household survey data and found private sector wages exceeding SOE wages in raw comparisons. However, after incorporating non-wage benefits (e.g., housing subsidies, pensions), total compensation in SOEs surpassed private firms significantly.

(b) Studies Reporting a Public Sector Disadvantage

Liu and Zhang (2020) leveraged CFPS2010 (China Family Panel Studies) data with quantile regression, demonstrating that wages in public institutions, SOEs, and private enterprises consistently exceeded those in the narrow public sector (i.e., government agencies) across all income quantiles. Similarly, Yao et al. (2016) conducted an Oaxaca-Blinder decomposition on CHIP2008 data, revealing higher wage returns in the private sector compared to the public sector. However, when regional disparities were adjusted, the public sector exhibited higher income returns—a paradox suggesting spatial heterogeneity in compensation structures.

(c) Studies Finding Time-Varying or Heterogeneous Differentials

Research on China’s public-private sector remuneration differentials reveals distinct temporal patterns. Yin and Gan (2009) analyzed CHNS (China Health and Nutrition Survey) data spanning 1989 to 2009, documenting a structural shift: public sector wages trailed non-public sector wages by 2.9% during 1989 to 1997, but surpassed them by 13.48% in 2000 to 2006. Building on this, Y. B. Zhang (2012) replicated the phased hypothesis with identical methodologies, finding post-2000 heterogeneity across income quantiles—public sector advantages persisted in lower quantiles, while non-public sectors dominated in higher quantiles. X. J. Chen and Yu (2019) employed Recentered Influence Function (RIF) unconditional quantile regression on CHNS microdata to track the state versus non-state sector wage gap. Their decomposition analysis captured an evolutionary trajectory: state sector wages transitioned from a discount (pre-reform) to a premium (post-2000s), with the primary drivers shifting from institutional factors (e.g., SOE monopoly privileges) to human capital determinants (e.g., education returns) in recent decades. This aligns with K. Z. Jiang et al. (2012), whose analysis of 1992 to 2008 data identified a reversal: non-public sector wage superiority during 1992 to 1996 gave way to public sector dominance post-1999. Notably, Zhou and Wang (2013) conducted a longitudinal analysis of CHNS survey data (1989–2009) using unconditional quantile regression, uncovering a V-shaped trajectory in the state sector premium—an initial narrowing followed by post-2000 widening, attributable to labor market liberalization and SOE restructuring policies. Finally, Tian and Shen (2022) leverage CHIP 2013/2018 data and Copula-based empirical methods to reveal a striking temporal shift: public-sector workers enjoyed a statistically significant wage premium in 2013, yet faced a distinct pay penalty by 2018. 
This reversal challenges conventional assumptions about the persistence of public-sector compensation advantages. In addition, heterogeneity analyses show that sectoral wage gaps vary with gender and education. Using endogenous switching models applied to CHIP 2013 urban survey data, Wan et al. (2021) show that men earn higher wages in the non-public sector than in public institutions, whereas women achieve income advantages within the public sector, which highlights how institutional pay structures asymmetrically shape gender outcomes in China’s segmented labor market.

Research Gaps and This Study's Contributions

The literature review reveals three critical gaps: (1) inconsistent empirical conclusions requiring rigorous re-examination, (2) narrow conceptualization of compensation limited to wage differentials, and (3) outdated datasets constraining policy relevance. To address these limitations, and while building on the existing literature, our study provides the most comprehensive temporal analysis using CGSS 2003–2021, focuses on returns to education as a novel mechanism, and combines multiple methods to address selection bias and distributional effects. This offers updated insights for China’s post-2013 reform era.

Rather than claiming entirely novel concepts, this study offers substantive incremental contributions by refining, integrating, and updating existing research paradigms in the following ways:

First, regarding the analysis of sectoral income differentials: While prior studies (e.g., Yao et al., 2016; J. W. Zhang & Xue, 2008) have established the foundation, their conclusions are inconsistent and often rely on static or dated snapshots. Our contribution lies not in discovering the gap itself, but in providing a robust, longitudinal re-examination that tracks the evolution of this gap across critical policy periods (post-WTO, post-2008 crisis, post-2013 reforms) using the most recent nationally representative data (CGSS 2003–2021). This allows us to move beyond the question of “if a gap exists” to “how the gap has dynamically changed” in China’s contemporary reform context, offering much-needed updated evidence.

Second, regarding the conceptualization of compensation and the mechanism: We acknowledge that analyzing education returns and broader compensation is not unprecedented. Our incremental contribution is twofold:

Mechanism Integration: We directly integrate sectoral income differentials with labor mobility decisions to empirically test how compensation gaps (including education returns) actually distort human capital allocation. This theoretical link is often discussed but remains underexplored empirically in transitional economies like China.

Conceptual Clarification and Measurement Honesty: We explicitly conceptualize “employment value” as encompassing both monetary and non-monetary benefits (welfare, job security, etc.) to better reflect real-world career choices. We acknowledge that due to data limitations, our empirical measurement of total income is currently limited to monetary wages. However, by formally establishing this comprehensive conceptual framework, we provide a clearer blueprint for future research when better data becomes available, and we rigorously analyze the component we can measure (cash income) within this broader theoretical context.

Third, we move beyond the mere existence of a sectoral income gap to probe its underlying drivers, with a particular focus on the returns to education as a pivotal mechanism. While prior literature has documented the average wage differential, our contribution is to empirically test whether and how the premium for educational attainment differs between the public and private sectors, and how this disparity contributes to human capital misallocation. This mechanistic focus is especially relevant for China’s college-educated cohort—a group critical for innovation-led growth—although our analysis encompasses the entire labor market. Decomposing the variations in the education premium between sectors provides novel insights into the root causes of skill-occupation mismatches.

Finally, regarding methodology: Our primary methodological contribution is not in using any single technique in isolation, as OB decomposition or quantile regression have been used before. It is in the deliberate combination of PSM, OB decomposition, and quantile regression within a unified framework to tackle different aspects of the problem. This multi-method approach is specifically designed to address selection bias (via PSM), decompose the gap’s sources (via OB), and unveil its distributional heterogeneity (via Quantile Regression) in a more comprehensive and robust manner than most prior studies that typically employ only one or two of these methods.

Research Objectives and Empirical Strategy

The primary objectives of this paper are threefold: (1) to re-examine the public-private income differentials using the latest CGSS data; (2) to analyze the returns to education as a key mechanism driving sectoral choice; and (3) to address the methodological challenges of selection bias and heterogeneous distributional effects.

Crucially, each objective necessitates a specific econometric technique, and their combined use is central to our research design—not merely for robustness checks. The rationale for this multi-method approach is as follows:

To mitigate selection bias (Objective 3) arising from the non-random sorting of individuals into sectors, we employ Propensity Score Matching (PSM). We selected PSM over alternatives like the Heckman model due to its semi-parametric nature and flexibility in handling non-linear relationships without stringent distributional assumptions (W. Sun et al., 2016). Under the conditional independence assumption, that is, selection on observables, this method allows us to construct a counterfactual that supports a causal interpretation of the income gap.

To deconstruct the matched income gap (Objective 1) and test the mechanism of returns to education (Objective 2), we apply the Oaxaca-Blinder (OB) decomposition. This technique is necessary to quantify what portion of the gap is due to differences in observable characteristics (e.g., a more educated workforce) versus differences in the returns to those characteristics (e.g., a higher premium for the same education).
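The split between characteristics and returns described above can be illustrated with a small twofold decomposition. This is a minimal sketch on synthetic data, not the paper's estimation code: the helper names, the simulated coefficients, and the choice of private-sector coefficients as the reference are all assumptions made for illustration.

```python
import numpy as np

def ols(X, y):
    """OLS coefficients; X must already contain a constant column."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def oaxaca_blinder_twofold(X_pub, y_pub, X_pri, y_pri):
    """Twofold decomposition using private-sector coefficients as reference (assumption)."""
    b_pub, b_pri = ols(X_pub, y_pub), ols(X_pri, y_pri)
    xbar_pub, xbar_pri = X_pub.mean(axis=0), X_pri.mean(axis=0)
    explained = (xbar_pub - xbar_pri) @ b_pri    # differences in characteristics
    unexplained = xbar_pub @ (b_pub - b_pri)     # differences in returns
    return explained, unexplained, b_pub - b_pri

rng = np.random.default_rng(1)

def make_sector(n, edu_mean, intercept, edu_return):
    """Synthetic log-income data for one sector (hypothetical parameters)."""
    edu = rng.normal(edu_mean, 3.0, n)
    X = np.column_stack([np.ones(n), edu])
    y = intercept + edu_return * edu + rng.normal(0, 0.5, n)
    return X, y

X_pub, y_pub = make_sector(4000, 12.6, 8.4, 0.12)
X_pri, y_pri = make_sector(6000, 10.4, 9.1, 0.08)

explained, unexplained, delta_b = oaxaca_blinder_twofold(X_pub, y_pub, X_pri, y_pri)
raw_gap = y_pub.mean() - y_pri.mean()
print(f"raw gap={raw_gap:.3f} explained={explained:.3f} unexplained={unexplained:.3f}")
print(f"difference in education returns={delta_b[1]:.3f}")
```

Because each sector regression includes a constant, the two components add up to the raw mean gap exactly; switching the reference coefficients changes how the gap is split but not this identity.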

To fully capture distributional heterogeneity (Objective 3) obscured by mean analysis, we implement Unconditional Quantile Regression (UQR). This method is needed to determine whether the income gap and the returns to education are consistent across the entire earnings distribution, which provides more nuanced insights for policy.
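One common way to implement UQR is to regress the recentered influence function (RIF) of a quantile on the covariates. The sketch below assumes that RIF-based approach, a Gaussian kernel with a normal-reference bandwidth, and synthetic data; none of these choices is confirmed by the text, and all function names are hypothetical.

```python
import numpy as np

def gaussian_kde_at(y, point):
    """Gaussian kernel density estimate at one point (normal-reference bandwidth)."""
    n = y.size
    h = 1.06 * y.std(ddof=1) * n ** (-1 / 5)
    u = (point - y) / h
    return np.exp(-0.5 * u**2).sum() / (n * h * np.sqrt(2 * np.pi))

def rif_quantile(y, tau):
    """RIF of the tau-th quantile: q + (tau - 1{y <= q}) / f(q)."""
    q = np.quantile(y, tau)
    return q + (tau - (y <= q)) / gaussian_kde_at(y, q)

def uqr(X, y, tau):
    """Unconditional quantile regression: OLS of the RIF on X."""
    return np.linalg.lstsq(X, rif_quantile(y, tau), rcond=None)[0]

# Synthetic data with hypothetical coefficients.
rng = np.random.default_rng(3)
n = 5_000
edu = rng.choice([6.0, 9.0, 12.0, 15.0, 16.0, 19.0], size=n)
public = rng.binomial(1, 0.4, n).astype(float)
log_income = 8.8 + 0.09 * edu - 0.2 * public + rng.normal(0, 0.6, n)
X = np.column_stack([np.ones(n), edu, public])

for tau in (0.1, 0.5, 0.9):
    const, b_edu, b_public = uqr(X, log_income, tau)
    print(f"tau={tau:.1f}  edu={b_edu:.3f}  public={b_public:.3f}")
```

Comparing the coefficients across values of `tau` is what reveals whether the sector gap or the return to education differs between low and high earners.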

We anticipate that this rigorous, multi-pronged empirical strategy will yield robust results showing a significant public sector compensation premium in contemporary China, primarily driven by higher returns to education and non-wage benefits, and that this premium varies substantially across the income distribution.

Paper Structure

The remainder of this paper is structured as follows: Section 2 establishes a theoretical model and proposes research hypotheses. Section 3 details the data sources and variable definitions. Section 4 presents the empirical results conducted through the aforementioned techniques. Section 5 concludes with policy implications.

Theoretical Framework and Research Hypotheses

Theoretical Framework

To elucidate the core mechanism of human capital allocation, we develop a parsimonious model grounded in random utility theory (McFadden, 1974). An individual i chooses to work in the sector that delivers the highest utility. The utility from working in sector j (public or private) is:

$$U_{ij} = β_{j} \cdot Edu_{i} + ε_{ij} \tag{1}$$

where $β_{j}$ is the sector-specific return to education (the education premium), and $ε_{ij}$ captures all unobserved individual-specific tastes for sector j.

The individual selects the public sector if $U_{i, pub} > U_{i, pri}$ . This choice probability can be expressed as:

$$P_{i, pub} = \Pr \left[ (β_{pub} - β_{pri}) \cdot Edu_{i} > ε_{i, pri} - ε_{i, pub} \right] \tag{2}$$

Let $Δ β = β_{pub} - β_{pri}$ denote the relative education premium in the public sector, the key parameter of interest. Let $η_{i} = ε_{i, pri} - ε_{i, pub}$ . Assuming $ε_{i, pub}$ and $ε_{i, pri}$ are independent and identically distributed following the Type I Extreme Value distribution, the difference $η_{i}$ follows a logistic distribution (McFadden, 1974). This distributional assumption is essential, as it yields a closed-form solution for the choice probability:

$$P_{i, pub} = Λ (Δ β \cdot Edu_{i}) \tag{3}$$

where $Λ (\cdot)$ is the logistic cumulative distribution function.
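The closed form in Equation 3 can be checked numerically by drawing the two taste shocks and comparing the simulated choice frequency with the logistic CDF. This is only an illustrative sketch: the premium value and the function names are hypothetical and are not estimates from the paper.

```python
import numpy as np

def logistic_cdf(z):
    """Logistic CDF: 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def simulate_public_share(delta_beta, edu_years, n_draws=200_000, seed=0):
    """Monte Carlo frequency of choosing the public sector at one education level."""
    rng = np.random.default_rng(seed)
    # Taste shocks: i.i.d. Type I Extreme Value (Gumbel) draws for each sector.
    eps_pub = rng.gumbel(size=n_draws)
    eps_pri = rng.gumbel(size=n_draws)
    # Public is chosen when (beta_pub - beta_pri) * Edu > eps_pri - eps_pub (Equation 2).
    chooses_public = delta_beta * edu_years > eps_pri - eps_pub
    return chooses_public.mean()

delta_beta = 0.05   # hypothetical relative premium, chosen for illustration
edu_years = 16      # a bachelor's degree under the paper's coding of schooling

simulated = simulate_public_share(delta_beta, edu_years)
closed_form = logistic_cdf(delta_beta * edu_years)
print(f"simulated={simulated:.4f}  closed_form={closed_form:.4f}")
```

The two numbers should agree up to simulation noise, which is the practical content of the Type I Extreme Value assumption.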

Testable Hypotheses

From Equation 3, we derive our core testable hypotheses at both the micro and macro levels.

H1 (Micro-level choice): Ceteris paribus, an increase in the public sector’s relative education premium $(Δ β)$ raises an individual’s probability of selecting into the public sector. Furthermore, this effect is positively moderated by the individual’s education level $(Edu_{i})$.
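One way to see where H1 comes from is to differentiate Equation 3 with respect to the relative premium. The short derivation below is a sketch that uses only the symbols already defined, plus the shorthand $z_{i} = Δ β \cdot Edu_{i}$, which is introduced here for convenience.

$$\frac{\partial P_{i, pub}}{\partial Δ β} = Edu_{i} \cdot Λ (z_{i}) \left[ 1 - Λ (z_{i}) \right], \qquad z_{i} = Δ β \cdot Edu_{i}$$

Because $Λ (z_{i}) [1 - Λ (z_{i})]$ is strictly positive, the derivative is positive whenever $Edu_{i} > 0$, which gives the first part of H1. The leading factor $Edu_{i}$ scales the response to a given change in $Δ β$, which is the sense in which education moderates the effect. Since $Λ (1 - Λ)$ shrinks as $|z_{i}|$ grows, this scaling is clearest when the choice probability is not already close to 0 or 1.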

This micro-foundation allows for aggregation to the macro level. The share of the workforce employed in the public sector, $S_{pub}$, is the expectation of individual choice probabilities:

$$S_{pub} = E \left[ Λ (Δ β \cdot Edu_{i}) \right] \tag{4}$$

Equation 4 establishes the critical link between a micro parameter and a macro outcome, leading to our second hypothesis.

H2 (Macro-level equilibrium): The aggregate employment share of the public sector ($S_{pub}$) is a positive function of the relative education premium ($Δ β$).
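The monotonic relationship in H2 can be illustrated by evaluating the sample analogue of Equation 4 over a grid of premiums. The workforce and the grid below are hypothetical values chosen only to show the shape of the relationship.

```python
import numpy as np

def logistic_cdf(z):
    """Logistic CDF: 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def public_share(delta_beta, edu_years):
    """Sample analogue of Equation 4: mean of individual choice probabilities."""
    edu_years = np.asarray(edu_years, dtype=float)
    return logistic_cdf(delta_beta * edu_years).mean()

# Hypothetical workforce: years of schooling under the paper's coding scheme.
edu_years = np.array([6, 9, 9, 12, 12, 12, 15, 16, 16, 19])

# Illustrative grid of relative premiums, not estimates from the paper.
premiums = np.linspace(-0.05, 0.10, 7)
shares = np.array([public_share(db, edu_years) for db in premiums])

for db, share in zip(premiums, shares):
    print(f"delta_beta={db:+.3f}  S_pub={share:.3f}")
```

In this stripped-down model there is no intercept, so a zero premium implies an even split between sectors; in an empirical setting other determinants of sector choice would shift that baseline.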

Empirical Roadmap

To empirically test the hypotheses derived from our theoretical model, our analysis in Section 4 proceeds through the following structured sequence:

Testing H1: We estimate a Logit model of sectoral choice to assess the effect of education on the probability of selecting into the public sector, conditional on observables.

Estimating the Premium: We employ Propensity Score Matching (PSM) to address selection bias, followed by Oaxaca-Blinder (OB) decomposition to quantify the public sector education premium (Δβ) at the mean.

Examining Heterogeneity: We use Unconditional Quantile Regression (UQR) to explore variation across the income distribution and conduct additional analysis using a narrow public sector definition (excluding SOEs) to test for sectoral heterogeneity.

Testing H2: We analyze the co-movement between the annual education premium ($Δ β_{t}$) and the public sector’s employment share ($S_{pub, t}$) using scatter plots and regression analysis.

Robustness Checks: We perform comprehensive tests to validate our findings, including:

Rosenbaum bounds sensitivity analysis for hidden bias in PSM.

CIA tests using pre-treatment variables (e.g., gender, parental education, parental employment sector).

Placebo tests with fake sector definitions.

OB decomposition with alternative benchmarks (e.g., using private-sector coefficients as the reference) to assess sensitivity to baseline choices.

Additional controls for occupation and firm size, with standard errors clustered at the province level.

This multi-method approach ensures each step is tied to our theoretical framework, while robustness checks enhance the credibility of our results.
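As an illustration of the placebo logic listed among the robustness checks, the sketch below reassigns sector labels at random and re-estimates the gap in education returns. Random permutation is only one way to build a fake sector and is an assumption here, as are the synthetic data, the interaction-regression shortcut, and the function name.

```python
import numpy as np

def education_return_gap(edu, sector, y):
    """Coefficient on edu*sector in y = a + b*edu + c*sector + d*edu*sector."""
    X = np.column_stack([np.ones_like(edu), edu, sector, edu * sector])
    return np.linalg.lstsq(X, y, rcond=None)[0][3]

# Synthetic data with a hypothetical 0.04 gap in education returns.
rng = np.random.default_rng(7)
n = 10_000
edu = rng.choice([6.0, 9.0, 12.0, 15.0, 16.0, 19.0], size=n)
public = rng.binomial(1, 0.4, n).astype(float)
log_income = (9.0 + 0.08 * edu - 0.6 * public
              + 0.04 * edu * public + rng.normal(0, 0.6, n))

actual_gap = education_return_gap(edu, public, log_income)

# Placebo: shuffle the sector label so it carries no real information.
placebo_gaps = np.array([
    education_return_gap(edu, rng.permutation(public), log_income)
    for _ in range(200)
])
p_value = np.mean(np.abs(placebo_gaps) >= abs(actual_gap))

print(f"actual gap={actual_gap:.4f}")
print(f"placebo mean={placebo_gaps.mean():.4f}  share as extreme={p_value:.3f}")
```

If the estimated premium reflected something other than sector membership, the placebo gaps would not cluster around zero.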

Data and Sample, Measurement of Variables, and Econometric Framework

Model Design

In studying the income disparity between the public and private sectors, a critical issue that must be addressed is sample selection bias. Specifically, since individuals are not randomly assigned to either sector but actively choose between public and private employment, it is possible that certain unobserved factors simultaneously influence both their sector choice and their income level. This can lead to biased estimates in the analysis.

The primary methodologies in the literature to correct for this sample selection bias include the Heckman sample selection model, the endogenous switching regression model, and propensity score matching (PSM). While all three approaches aim to address selectivity bias, the former two suffer from significant limitations compared to PSM. First, they inherently impose strong parametric assumptions, primarily relying on linear functional forms for the outcome equations. Second, these parametric models may rely on imposing restrictions on non-comparable samples (W. Sun et al., 2016). In contrast, propensity score matching largely overcomes these drawbacks. Consequently, the subsequent empirical analysis in this paper will employ PSM to investigate the income gap between China’s public and private sectors.

Propensity score matching (PSM), initially developed by Rosenbaum and Rubin (1983), is a semi-parametric estimation method. Its approach to estimating the sectoral income differential involves first estimating a propensity score based on observed covariates. This score represents an individual’s probability of being employed in the public sector. Specifically, a logit model is used to estimate this probability:

$$\Pr (Public_{i} = 1 \mid X_{i}) = \frac{\exp (β X_{i})}{1 + \exp (β X_{i})} \tag{5}$$

Where:

$Public_{i} \in \{0, 1\}$ is a binary variable indicating employment sector (1 = Public Sector, 0 = Private Sector).

$X_{i}$ is the vector of observed covariates for individual i.

β is the vector of corresponding coefficients.

Using the estimated propensity scores, the method then matches each individual in the “treatment” group (public sector employees) with one or more individuals in the “control” group (private sector employees) who have a very similar propensity score. By pairing individuals with comparable background characteristics and pre-treatment tendencies, PSM mitigates the influence of confounding factors and allows for a more accurate estimation of the average treatment effect on the treated (ATT), effectively isolating the impact of sector affiliation on income.
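The two steps described above, estimating the score with a logit and then matching on it, can be sketched as follows. This is a simplified illustration on synthetic data using one-to-one nearest-neighbour matching with replacement; the matching variant, the covariates, and the simulated effect size are assumptions rather than the paper's actual specification.

```python
import numpy as np

def fit_logit(X, d, n_iter=25):
    """Logit coefficients by Newton-Raphson; X includes a constant column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        gradient = X.T @ (d - p)
        information = (X * (p * (1 - p))[:, None]).T @ X
        beta += np.linalg.solve(information, gradient)
    return beta

def att_nearest_neighbor(pscore, d, y):
    """ATT from 1-to-1 nearest-neighbour matching on the score, with replacement."""
    treated = np.flatnonzero(d == 1)
    control = np.flatnonzero(d == 0)
    order = np.argsort(pscore[control])
    sorted_scores = pscore[control][order]
    # For each treated score, compare the closest control on either side.
    pos = np.searchsorted(sorted_scores, pscore[treated])
    left = np.clip(pos - 1, 0, control.size - 1)
    right = np.clip(pos, 0, control.size - 1)
    dist_left = np.abs(sorted_scores[left] - pscore[treated])
    dist_right = np.abs(sorted_scores[right] - pscore[treated])
    nearest = np.where(dist_left <= dist_right, left, right)
    matched = control[order][nearest]
    return (y[treated] - y[matched]).mean()

# Synthetic data: selection into the public sector depends on observables only.
rng = np.random.default_rng(42)
n = 20_000
edu = rng.choice([6.0, 9.0, 12.0, 15.0, 16.0, 19.0], size=n)
experience = rng.uniform(0, 40, n)
X = np.column_stack([np.ones(n), edu, experience])
index = -3.0 + 0.2 * edu + 0.02 * experience
public = rng.binomial(1, 1 / (1 + np.exp(-index)))
true_effect = -0.30   # hypothetical sector effect on log income
log_income = (8.5 + 0.10 * edu + 0.015 * experience
              + true_effect * public + rng.normal(0, 0.6, n))

pscore = 1 / (1 + np.exp(-X @ fit_logit(X, public)))
att = att_nearest_neighbor(pscore, public, log_income)
naive_gap = log_income[public == 1].mean() - log_income[public == 0].mean()
print(f"naive gap={naive_gap:.3f}  matched ATT={att:.3f}  true effect={true_effect}")
```

In this simulation the naive comparison is distorted because public-sector workers are more educated, while the matched estimate recovers the simulated effect. That recovery relies on selection depending only on observed covariates, which is exactly what matching on the score assumes.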

Data Sources and Preprocessing

This study utilizes data from the Chinese General Social Survey (CGSS), the earliest nationwide academic survey project in China, initiated by Renmin University of China in 2003. The dataset spans 12 survey years (2003, 2005, 2006, 2008, 2010, 2011, 2012, 2013, 2015, 2017, 2018, and 2021) and contains comprehensive individual-level information, providing robust support for analyzing the differences in education returns between public and private sectors and labor force employment choices.

To ensure the reliability of the findings, the analysis focuses exclusively on working-age individuals: males aged 16 to 60 and females aged 16 to 55. Respondents outside these age ranges, individuals currently enrolled in school, those without employment, and cases with missing key variables were excluded. The final dataset comprises 42,091 valid observations, including 15,556 from the public sector and 26,535 from the non-public sector.

Variable Definitions

This section details the variables employed in the regression analysis, including their definitions and measurement.

Dependent Variable

Annual Income Level (logincome): Based on the design of the CGSS questionnaire, this variable uses the annual income reported by individual respondents. Nominal income is adjusted for inflation using the Consumer Price Index (CPI) and expressed in constant 2021 prices. The resulting real income is then transformed using the natural logarithm. Note that CGSS captures self-reported total annual income, which is intended to reflect comprehensive compensation; however, respondents may not fully include or accurately value all non-wage benefits (e.g., employer-provided housing, health insurance), which presents a measurement challenge.

Treatment Variable

Sector of Employment (public): This binary variable distinguishes between the public sector (public = 1) and the non-public sector (public = 0).

The classification of sectors aligns predominantly with the established practice in the literature: government agencies (dangzheng jiguan) and public institutions (shiye danwei) constitute the public sector, while privately-owned enterprises (siying qiye) are classified as non-public. The main ambiguity lies in the treatment of State-Owned Enterprises (SOEs). Some studies categorize SOEs within the non-public sector, arguing they operate as profit-maximizing entities. Others contend that SOEs in China fulfill unique social responsibilities beyond pure profit maximization and should therefore be classified as part of the public sector.

To address this ambiguity, the baseline regression adopts the latter approach, grouping SOEs with the public sector (public = 1). This classification is justified by the institutional realities in China, where SOEs often operate under quasi-governmental roles, fulfill social and political mandates beyond pure profit maximization, and are subject to direct state oversight and personnel management similar to core public institutions (X. J. Chen & Yu, 2019; Zhou & Wang, 2013). Subsequently, a robustness check reclassifies SOEs as non-public (public = 0) to verify that the core findings are not sensitive to this specific definitional choice.

Control Variables

(1) Gender (gender): Binary variable coded as 1 for Male, 0 for Female.

(2) Household Registration (hukou): Binary variable coded as 1 for Urban Registration (chengzhen hukou), 0 for Rural Registration (nongcun hukou).

(3) Marital Status (marriage): Binary variable coded as 1 for Married, 0 for other statuses (including single, divorced, widowed, etc.).

(4) Educational Attainment (education): Reflecting standard methodology in the literature, this continuous variable represents the total number of years of formal education, assigned as follows: No Formal Education = 0; Primary School = 6; Junior Secondary School = 9; Senior Secondary School/Vocational High School/Technical Secondary School (zhongzhuan) = 12; College Diploma (dazhuan) = 15; Bachelor’s Degree (benke) = 16; Graduate Degree or above (yanjiusheng ji yishang) = 19.

(5) Potential Work Experience (experience) and its Square (experience2): Constructed following the conventional formula: Experience = Age − Years of Education (education) − 6. To capture potential non-linear effects of experience on income, the squared term (experience2) is also included.

(6) Political Affiliation (party): Binary variable coded as 1 for members of the Chinese Communist Party (CCP), 0 for non-members or members of other parties.

(7) Parental Employment Sector (fpublic, mpublic): To account for documented patterns of intergenerational transmission within public sector employment (Han et al., 2016), the sector of employment for each parent is included. fpublic indicates the father’s employment sector (1 = Public, 0 = Non-public). mpublic indicates the mother’s employment sector (1 = Public, 0 = Non-public).
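The education and experience variables in items (4) and (5) can be sketched as a lookup table and a small helper. The dictionary keys and function names are hypothetical labels chosen for illustration, and flooring experience at zero is an assumption of this sketch that the text does not state.

```python
# Years of schooling assigned to each attainment level, as listed in item (4).
EDUCATION_YEARS = {
    "no_formal_education": 0,
    "primary": 6,
    "junior_secondary": 9,
    "senior_secondary": 12,   # includes vocational high school and zhongzhuan
    "college_diploma": 15,    # dazhuan
    "bachelor": 16,           # benke
    "graduate": 19,           # yanjiusheng ji yishang
}

def potential_experience(age, education_years):
    """Experience = age - years of education - 6.

    The floor at zero is an assumption of this sketch, not part of the source.
    """
    return max(age - education_years - 6, 0)

def experience_terms(age, level):
    """Return (education, experience, experience squared) for one respondent."""
    years = EDUCATION_YEARS[level]
    exp = potential_experience(age, years)
    return years, exp, exp ** 2
```

For example, a 40-year-old with a bachelor's degree is assigned 16 years of education and 40 − 16 − 6 = 18 years of potential experience.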

Descriptive statistics for all variables are presented in Table 1.

Table 1.

Descriptive Statistics of Each Variable.

Main variable	Public sector	Non-public sector	T-test for differences between groups
Logincome	9.92 (1.06)	9.95 (1.13)	−0.03**
Gender	0.58 (0.49)	0.57 (0.49)	0.01
Hukou	0.83 (0.37)	0.46 (0.5)	0.37***
Marriage	0.85 (0.36)	0.8 (0.4)	0.05***
Education	12.63 (3.29)	10.37 (3.48)	2.26***
Experience	22.39 (11.48)	21.94 (11.74)	0.45***
Party	0.27 (0.44)	0.06 (0.24)	0.21***
Fpublic	0.5 (0.5)	0.22 (0.42)	0.28***
Mpublic	0.29 (0.45)	0.12 (0.33)	0.17***
Observations	15,556	26,535

Note. The standard deviation is in parentheses.

“***” and “**” indicate that the t value is significant at the 1% and 5% levels, respectively.
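A difference-in-means t statistic can be reproduced approximately from the summary statistics in Table 1. The sketch below uses the unequal-variance (Welch) form, which is an assumption: the article does not state which variant of the t-test was used, and the inputs are the rounded values printed in the table.

```python
import math

def welch_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """t statistic for a difference in means, allowing unequal variances."""
    se = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / se

# Education row of Table 1: public sector vs. non-public sector.
t_education = welch_t(12.63, 3.29, 15556, 10.37, 3.48, 26535)
```

With samples of this size, even modest differences in means produce large t statistics, which is why nearly every row of Table 1 is significant at the 1% level.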

Based on the descriptive statistics presented in Table 1, several preliminary conclusions can be drawn:

Sectoral Income Gap: Annual total income (logincome) appears slightly higher in the non-public sector compared to the public sector, though the difference is not substantial.

Marital Status Composition: The proportion of married individuals is significantly higher within the public sector. This may reflect a stronger preference among married individuals for job stability, a characteristic often associated with public sector employment.

Urban Hukou and CCP Membership: Individuals in the public sector exhibit a notably higher prevalence of both urban household registration (hukou) and membership in the Chinese Communist Party (party). This likely stems from a two-way relationship: urban registration and CCP affiliation may facilitate entry into the public sector, while conversely, public sector employment may also increase an individual’s chances of obtaining urban registration and/or party membership.

Human Capital Endowment: Individuals employed in the public sector demonstrate stronger human capital attributes. Both average years of education (education) and work experience (experience) are higher in the public sector compared to the non-public sector. This pattern aligns with the observed “public sector job application surge” (kao gong re) among highly-educated individuals in China in recent years. Furthermore, the lower job turnover in the public sector allows employees greater opportunity to accumulate more extensive experience within a specific role.

Intergenerational Transmission: Evidence of significant intergenerational transmission is present concerning access to public sector positions.

However, it is crucial to emphasize that these findings are preliminary and derived solely from descriptive analysis of the raw data. More precise causal inferences require the subsequent econometric testing presented below.

In addition, the expected influence of the control variables differs between the two primary stages of our analysis: the selection into the public sector (estimated via the logit model for propensity scores) and the determination of income (the outcome equation). Table 2 summarizes the theoretical expectations for each variable’s coefficient in both models, along with the primary rationale. These expectations are derived from standard labor economics theory and the findings of prior studies on the Chinese labor market.

Table 2.

Theoretical Expectations for Variable Coefficients in Sector Choice and Income Determination Models.

Variable	Expected sign in sector choice (Dependent: Public = 1)	Theoretical rationale (for sector choice)	Expected sign in income equation (dependent: logincome)	Theoretical rationale (for income)
Gender (1 = Male)	Ambiguous (?) or Negative (−)	No strong prior. Cultural norms might push men toward higher-risk, higher-reward private sector (Adamchik & Bedi, 2000). Alternatively, no significant difference.	Positive (+)	Robust finding of a gender wage gap, where men earn more than women, ceteris paribus (e.g., Wan et al., 2021).
Hukou (1 = Urban)	Positive (+)	Urban registration is often a prerequisite or strong advantage for accessing public sector jobs in China (Liu & Zhang, 2020).	Positive (+)	Urban workers typically have better access to education, networks, and formal employment, leading to higher incomes.
Marriage (1 = Married)	Positive (+)	Married individuals may exhibit greater risk aversion and thus prefer the stability and benefits (e.g., family health insurance) of public sector employment.	Positive (+)	The “marriage premium” is a common finding, attributed to specialization within households or employer perception of stability.
Education (Years)	Positive (+)	The public sector in China highly values formal educational credentials for hiring and promotion, making it more attractive to the highly educated (Y. B. Zhang, 2012).	Positive (+)	A cornerstone of human capital theory: more education increases productivity and is rewarded with higher wages (Mincer, 1974).
Experience (years)	Positive (+)	Work experience is a valued attribute in both sectors.	Positive (+)	Represents accumulation of skills and on-the-job training.
Experience2	Negative (−)	If the positive effect of experience on public sector selection diminishes at higher levels.	Negative (−)	Standard to capture the concavity of earnings profiles; returns to experience eventually diminish.
Party (1 = CCP Member)	Positive (+)	Party membership is a key channel for recruitment into China’s public sector and is often a de facto requirement for advancement (Zhou & Wang, 2013).	Positive (+)	Party membership may signal loyalty or provide access to rent-seeking opportunities and networks, boosting income.
Fpublic(1 = Father in Public)	Positive (+)	Reflects intergenerational transmission of public sector employment and access to networks (Han et al., 2016).	——	——
Mpublic(1 = Mother in Public)	Positive (+)	Same rationale as fpublic.	——	——

Finally, to provide a more intuitive visualization of the income disparity between China’s public and private sectors and its evolution over time, we employ kernel density plots to illustrate the income distributions for each sector across all 12 years covered in the CGSS data. The results are presented in Figure 1.

Figure 1.

Kernel density plot of the logarithm of total income from 2003 to 2021.

Observations from Figure 1: Across all 12 survey years, the average level of total income (logincome) for employees in the public sector consistently exceeds that of their counterparts in the private sector. Crucially, the income distribution within the private sector exhibits significantly fatter tails at both the lower and upper ends compared to the public sector. This indicates greater income dispersion (i.e., larger income inequality) among private sector workers. In contrast, the income distribution within the public sector appears more compressed, signifying a narrower range of incomes and, consequently, lower income inequality within that sector.

Empirical Results and Related Tests

Empirical Strategy: A Roadmap

To test the hypotheses derived from the theoretical framework (Section 2), we employ a multi-stage empirical strategy designed to address methodological challenges—selection bias, decomposition of income differentials, and distributional heterogeneity—while providing a comprehensive view of the mechanisms. This approach ensures each step aligns with distinct research objectives, and the combination of techniques is necessary for robustness and depth.

Our analysis proceeds in the following sequence:

Model 1: Testing the Micro-Choice Mechanism (H1)

We estimate a Logit model of sectoral choice to examine whether higher education levels increase the probability of selecting into the public sector, conditional on observables (e.g., gender, Hukou, Party membership). This provides a direct test of H1.

Models 2–3: Estimating the Education Return Premium (Δβ) and Addressing Selection Bias

To robustly estimate Δβ (the public sector’s education return premium), we address selection bias via a two-step procedure:

Propensity Score Matching (PSM): We use PSM (one-to-one nearest-neighbor matching) to create a balanced sample where public and private sector workers are comparable based on pre-treatment covariates (e.g., education, experience, demographic factors).

Oaxaca-Blinder (OB) Decomposition: On the matched sample, we apply OB decomposition to decompose the mean income gap into explained (characteristics) and unexplained (coefficients) components, isolating Δβ.

Model 4: Examining Heterogeneity

We employ Unconditional Quantile Regression (UQR) to assess whether Δβ varies across the income distribution (e.g., low, median, high quantiles). Additionally, we test sectoral heterogeneity by re-running analyses with a narrow public sector definition (excluding SOEs) to ensure results are not driven by SOE inclusion.

Model 5: Temporal Analysis for Macro-Implication (H2)

We test H2 by analyzing the co-movement between the annual education return premium ( $Δ β_{t}$ ) and the public sector’s employment share ( $S_{pub, t}$ ) over time using scatter plots and regression analysis, providing macro-level validation.

Model 6: Robustness Checks

To ensure validity, we conduct extensive robustness tests:

Rosenbaum Bounds Sensitivity Analysis: To assess hidden bias in PSM estimates.

CIA Tests: Using pre-treatment variables only (e.g., parental education, childhood Hukou) to verify the conditional independence assumption.

Placebo Tests: Employing fake sector definitions to rule out spurious correlations.

OB Decomposition with Alternative Benchmarks: Using private-sector coefficients as the reference to test sensitivity to baseline choices.

Additional Controls: Including occupation and firm size variables in OB decompositions, with standard errors clustered at the province level for conservative inference.

This sequential approach ensures that each empirical model serves a distinct purpose, from micro-level choice to macro-level trends, while robustness checks enhance the credibility of our findings.
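The PSM step in the roadmap can be illustrated with a minimal sketch of one-to-one nearest-neighbor matching on already-estimated propensity scores. Matching with replacement and without a caliper are assumptions of this sketch, since the article does not specify them, and all scores and outcomes below are hypothetical.

```python
def nearest_neighbor_att(treated, controls):
    """ATT from 1:1 nearest-neighbor matching with replacement.

    treated, controls: lists of (propensity_score, outcome) pairs.
    """
    if not treated or not controls:
        raise ValueError("both groups must be non-empty")
    diffs = []
    for score_t, y_t in treated:
        # Pick the control whose propensity score is closest to this treated unit.
        _, y_c = min(controls, key=lambda c: abs(c[0] - score_t))
        diffs.append(y_t - y_c)
    return sum(diffs) / len(diffs)

# Hypothetical (score, log income) pairs.
treated = [(0.70, 10.0), (0.40, 9.5)]
controls = [(0.68, 10.4), (0.35, 9.6), (0.10, 9.0)]
att = nearest_neighbor_att(treated, controls)
```

In this toy example the two treated units are matched to the controls with scores 0.68 and 0.35, giving outcome differences of −0.4 and −0.1 and an ATT of −0.25.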

Propensity Score Estimation Results

The estimation results of the propensity score using the logit model are shown in Table 3.

Table 3.

Estimation Results of Propensity Score by Logit Model.

Variable	Coefficient	Variable	Coefficient
Gender	−0.118*** (0.025)	Experience2	0.0006*** (0.0001)
Hukou	1.141*** (0.03)	Party	1.144*** (0.038)
Marriage	0.21*** (0.038)	Fpublic	0.624*** (0.029)
Education	0.192*** (0.005)	Mpublic	0.091*** (0.035)
Experience	0.008* (0.005)	Constant term	−4.51*** (0.104)
Observations		42,091
Pseudo R²		.2149

Note. The errors in parentheses are standard errors.

“***” and “*” respectively represent that the t value is significant at the levels of 1%, and 10%.

The results from the logit model, which estimates the propensity for public sector employment, are presented in Table 3. The findings are largely consistent with theoretical expectations and the established literature on China’s labor market, while also revealing one significant and insightful deviation.

As anticipated, educational attainment exhibits a strong, positive, and statistically significant effect on the probability of public sector employment. This result provides direct empirical support for the hypothesis H1 of our theoretical model and aligns perfectly with the observed “civil service craze” among China’s highly educated workforce. Furthermore, the results confirm the powerful role of institutional and socio-demographic factors:

Urban hukou and CCP membership are paramount positive predictors, reflecting the well-documented institutional barriers and advantages in accessing public sector careers. The significant, positive coefficients for parental public sector employment (fpublic, mpublic) provide strong evidence of intergenerational transmission of occupational access.

The negative coefficient for gender suggests a stronger preference for public sector stability among women compared to men, while the positive sign for marriage indicates that married individuals also favor public sector employment.

However, the results for work experience present a nuanced deviation from the standard assumption of diminishing returns. Contrary to the initial conditional expectation, the coefficient for the squared term of experience (experience2) is positive and significant. This indicates that the marginal effect of additional experience on the probability of public sector employment increases over time. A compelling explanation for this finding lies in the unique seniority-based reward architecture of China’s public sector, where promotions, pension benefits, and job security are explicitly tied to tenure. This creates a powerful accelerating incentive and a “lock-in” effect, making the sector disproportionately more attractive as workers accumulate more experience.

Collectively, these results confirm that selection into China’s public sector is strongly non-random and is systematically driven by human capital, institutional status, and family background. The model’s strong explanatory power (Pseudo R² of .2149) affirms its appropriateness for generating reliable propensity scores for the subsequent matching analysis.
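The logit coefficients in Table 3 can be read more concretely with two small calculations: the odds ratio implied by the education coefficient, and the slope of the logit index with respect to experience. The sketch below works on the log-odds scale only; effects on the probability itself would also depend on the other covariates, which is why no probability figures are given.

```python
import math

# Coefficients copied from Table 3.
B_EDUCATION = 0.192
B_EXPERIENCE = 0.008
B_EXPERIENCE_SQ = 0.0006

def odds_ratio(coef):
    """Multiplicative change in the odds for a one-unit increase."""
    return math.exp(coef)

def experience_index_slope(years):
    """Derivative of the logit index with respect to experience."""
    return B_EXPERIENCE + 2 * B_EXPERIENCE_SQ * years

education_or = odds_ratio(B_EDUCATION)
slope_at_0 = experience_index_slope(0)
slope_at_20 = experience_index_slope(20)
```

One additional year of education multiplies the odds of public sector employment by about 1.21, and because the squared term is positive, the index slope for experience rises from 0.008 at zero years to 0.032 at 20 years.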

Premium Situation of Public Sector Revenue

After estimating each individual’s propensity score, we match individuals in the treatment and control groups on this score and compute the average treatment effect on the treated (ATT), which here measures the public sector income premium. Common matching algorithms include one-to-one (nearest-neighbor) matching, radius matching, and kernel matching. To capture how the premium changes over time, we report the ATT for each survey year in addition to the full-sample estimate, using one-to-one matching throughout. The results are shown in Table 4. (Because the 2011 sample is small, control-group individuals with propensity scores close to those of the treatment group may not be available; Stata therefore reports “no observations” for that year. The same applies to later tables that include 2011 results.)

Table 4.

ATT Values Obtained by the One-to-One Propensity Score Matching Method.

Year	Public sector sample size	Non-public sector sample size	Public sector income premium (ATT)	Standard error	t value
2003	2,237	912	0.057	0.057	1
2005	2,108	2,436	−0.049	0.048	−1.02
2006	1,941	2,611	−0.068	0.045	−1.52
2008	1,185	1,715	−0.017	0.066	−0.25
2010	1,159	2,439	−0.138**	0.058	−2.38
2011	559	1,343	——	——	——
2012	1,215	2,819	−0.179***	0.051	−3.5
2013	1,178	2,744	−0.191***	0.045	−4.24
2015	991	2,273	−0.081	0.059	−1.38
2017	1,169	2,888	−0.218***	0.05	−4.32
2018	1,134	2,816	−0.133**	0.053	−2.5
2021	680	1,539	−0.189**	0.08	−2.37
Full sample	15,556	26,535	−0.324***	0.018	−18.06

“***” and “**” respectively represent that the t value is significant at the levels of 1%, and 5%.

As Table 4 shows, after correcting for sample selection bias with propensity score matching, income in the non-public sector is significantly higher than in the public sector. The full-sample ATT of −0.324 log points means that public sector income is roughly 28% lower (exp(−0.324) − 1 ≈ −0.277). By year, the ATT is positive but insignificant in 2003 and negative in every subsequent year. Since 2017 in particular, public sector income has been consistently and significantly lower than non-public sector income.
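Because the outcome is log income, an ATT is a difference in log points rather than an exact percentage. The conversion can be sketched as follows; the function name is hypothetical, and the input is the full-sample ATT from Table 4.

```python
import math

def log_points_to_percent(delta_log):
    """Convert a log-point difference into an exact percentage difference."""
    return (math.exp(delta_log) - 1.0) * 100.0

# Full-sample ATT from Table 4.
full_sample_pct = log_points_to_percent(-0.324)
```

For small effects the two scales nearly coincide, but at −0.324 log points the exact percentage gap is noticeably smaller in absolute value than 32.4%.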

In order to verify the reliability of the conclusion, a balance test of the matching effect needs to be conducted. The test results are shown in the following Figure 2:

Figure 2.

Balance test.

It can be seen from the figure that after matching, the standardized deviations of all variables have decreased, which indicates that the balance test has been passed. Through matching, we have obtained comparable treatment groups and control groups.
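Balance is commonly summarized by the standardized bias of each covariate. The sketch below uses the widely used definition (difference in means divided by the square root of the average of the two group variances, times 100). Whether Figure 2 uses exactly this formula is an assumption, and the example applies it to the unmatched education means and standard deviations from Table 1.

```python
import math

def standardized_bias(mean_t, sd_t, mean_c, sd_c):
    """Standardized percentage bias between treated and control groups."""
    pooled_sd = math.sqrt((sd_t ** 2 + sd_c ** 2) / 2.0)
    return 100.0 * (mean_t - mean_c) / pooled_sd

# Education before matching (Table 1): public vs. non-public sector.
raw_education_bias = standardized_bias(12.63, 3.29, 10.37, 3.48)
```

The pre-matching bias for education is large, roughly two thirds of a pooled standard deviation, which illustrates why matching is needed before comparing incomes across sectors.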

Furthermore, to ensure the quality of our matching procedure, we tested the common support assumption. The results of this test are presented in Figure 3.

Figure 3.

Test of the common support assumption.

Figure 3 shows that for most of the propensity score distribution, both the treatment and control groups lie within the region of common support. The substantial overlap indicates that the common support assumption is satisfied.

Decomposition of Income Disparity

This section decomposes the income gap between the public and private sectors using the Oaxaca-Blinder decomposition (hereinafter OB decomposition) to identify its sources. The OB method constructs a counterfactual group and splits the income gap into an endowment difference and a coefficient difference. The endowment difference is the part of the gap caused by differences in the productive characteristics of the labor force in the two sectors; it is called the explained part. The coefficient difference is the part caused by different rates of return to identical labor in the two sectors; it is called the unexplained part or the discriminatory part.

Specifically, regarding the research topic of this article, we take the public sector as the benchmark group, construct the counterfactual group as “non-public sector labor force regarded as working in the public sector,” and express the income they obtain as $W^{C}$ , that is, the income that employees currently working in the non-public sector could obtain if they worked in the public sector.

Therefore, the income disparity between the public and private sectors can be expressed as:

\begin{aligned} W_{i}^{pub} - W_{i}^{pri} &= (W_{i}^{pub} - W_{i}^{C}) + (W_{i}^{C} - W_{i}^{pri}) \\ &= β^{pub} (X_{i}^{pub} - X_{i}^{pri}) + X_{i}^{pri} (β^{pub} - β^{pri}) \end{aligned}

(6)

Among them, $β^{pub} (X_{i}^{pub} - X_{i}^{pri})$ represents the explainable endowment difference part, and $X_{i}^{pri} (β^{pub} - β^{pri})$ represents the unexplainable coefficient difference part.
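Equation 6 can be checked numerically with a minimal sketch of the twofold decomposition evaluated at group means, using the public sector coefficients as the benchmark. The means and coefficients below are hypothetical and serve only to show that the two components add up to the raw gap.

```python
def oaxaca_blinder_twofold(x_pub, x_pri, b_pub, b_pri):
    """Twofold decomposition with public sector coefficients as benchmark.

    Returns (endowment, coefficient) so that their sum equals
    b_pub . x_pub - b_pri . x_pri.
    """
    endowment = sum(b * (xp - xn) for b, xp, xn in zip(b_pub, x_pub, x_pri))
    coefficient = sum(xn * (bp - bn) for xn, bp, bn in zip(x_pri, b_pub, b_pri))
    return endowment, coefficient

def predicted_mean(x, b):
    """Mean outcome implied by covariate means x and coefficients b."""
    return sum(xi * bi for xi, bi in zip(x, b))

# Hypothetical covariate means and coefficients: [intercept, education].
x_pub, x_pri = [1.0, 12.0], [1.0, 10.0]
b_pub, b_pri = [8.0, 0.15], [8.6, 0.10]

endowment, coefficient = oaxaca_blinder_twofold(x_pub, x_pri, b_pub, b_pri)
gap = predicted_mean(x_pub, b_pub) - predicted_mean(x_pri, b_pri)
```

Here the raw gap of 0.2 splits into an endowment part of 0.3 and a coefficient part of −0.1. The coefficient part includes the intercept difference, which is why the education-specific term and the aggregate coefficient term can have different signs.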

The decomposition results are presented in Table 5 and yield insights critical to our theoretical framework. Note that Table 5 reports the gap as $Δ Y = Y_{private} - Y_{public}$ , the reverse of the sign convention in Equation 6. Under this definition, the negative endowment effect (−0.2551) indicates that public sector workers possess superior observable characteristics (e.g., higher average education and experience), which, all else equal, should grant them a higher income than their private-sector counterparts. However, the observed raw differential is positive (0.0343, indicating lower public sector earnings), which is entirely attributable to the large, positive coefficient effect (0.2894).

Table 5.

Decomposition Results of OB.

Total difference	The average logarithmic annual income of the public sector	9.9183*** (0.0085)
	The mean logarithmic annual income of the non-public sector	9.9527*** (0.0069)
	Logarithmic income disparity (non-public sector income − public sector income)	0.0343*** (0.011)
	Endowment difference	−0.2551*** (0.0088)
	Coefficient difference	0.2894*** (0.012)
Education	Endowment difference	−0.3331*** (0.0074)
Education	Coefficient difference	−0.5014*** (0.0476)

Note. The errors in parentheses are standard errors.

“***” indicates that the t value is significant at the 1% level.

Turning to education specifically, the lower panel of Table 5 shows that returns to education are significantly higher in the public sector. In our decomposition setup ( $Δ Y = Y_{private} - Y_{public}$ ), the negative sign of the coefficient difference for education (−0.5014) indicates that the return to education is greater in the public sector, which is the benchmark group. The magnitude of this value thus confirms a substantial education premium in the public sector.
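In a twofold decomposition the aggregate endowment and coefficient components should sum to the total gap. The sketch below applies that consistency check to the aggregate rows reported in Tables 5, 6, 11, and 12; the tolerance is an assumption chosen to absorb rounding in the published figures.

```python
# (total gap, endowment difference, coefficient difference) as reported.
DECOMPOSITIONS = {
    "Table 5": (0.0343, -0.2551, 0.2894),
    "Table 6": (-0.237, -0.311, 0.074),
    "Table 11": (0.071, -0.226, 0.297),
    "Table 12": (0.041, -0.242, 0.283),
}

def is_additive(total, endowment, coefficient, tol=0.0015):
    """True when endowment + coefficient reproduces the total gap."""
    return abs(total - (endowment + coefficient)) <= tol

additivity = {name: is_additive(*vals) for name, vals in DECOMPOSITIONS.items()}
```

All four reported decompositions satisfy the identity, so the aggregate rows are internally consistent with the stated sign convention.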

Heterogeneity Analysis: Sectoral and Distributional Dimensions

This subsection examines heterogeneity in the education return premium across sectors and income distribution. We first analyze sectoral differences using a narrow public sector definition (excluding SOEs) to isolate the core government effect, followed by distributional analysis using quantile regression on the broad definition.

Sectoral Heterogeneity: Narrow Public Sector Versus Private Sector

We employ OB decomposition on the matched sample from propensity score matching (PSM) to isolate sectoral effects. The OB results (Table 6) reveal that the narrow public sector exhibits an overall income premium (total gap = −0.237, indicating higher income compared to the private sector). Decomposition shows this premium is primarily driven by endowment differences (explained component = −0.311), indicating that public sector workers possess superior observable characteristics (e.g., higher education levels). The coefficient effect (unexplained component = 0.074) is positive but small, suggesting that structural returns to characteristics play a minimal role, with private sector returns slightly higher if characteristics were equal.

Table 6.

The OB Decomposition Results of the Narrow Public Sector.

Total difference	The average logarithmic annual income of the public sector	10.131*** (0.011)
	The mean logarithmic annual income of the non-public sector	9.894*** (0.006)
	Logarithmic income disparity (non-public sector income − public sector income)	−0.237*** (0.012)
	Endowment difference	−0.311*** (0.009)
	Coefficient difference	0.074*** (0.013)
Education	Endowment difference	−0.416*** (0.008)
Education	Coefficient difference	−0.236*** (0.059)

Note. The errors in parentheses are standard errors.

“***”, “**”, and “*” respectively indicate that the t value is significant at the levels of 1%, 5%, and 10%.

For the education variable specifically, the OB decomposition (Table 6, lower part) shows a negative endowment effect (−0.416) and a negative coefficient effect (−0.236), indicating that both higher educational endowment and higher returns to education in the public sector contribute to the premium, with the endowment effect being larger in magnitude. This implies that the income advantage stems mainly from better educational characteristics of public sector employees, supplemented by higher returns to education.

Distributional Heterogeneity: Income Quantile Analysis

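We next use unconditional quantile regression (UQR) with the broad public sector definition to explore income distribution effects. Results (Table 7) show that the education coefficient is positive and significant in both sectors at every quantile, and that the public sector coefficient exceeds the non-public one throughout (e.g., 0.2103 versus 0.1635 at the 0.1 quantile and 0.1713 versus 0.1259 at the 0.9 quantile). The public sector thus offers higher returns to education regardless of income level.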

Table 7.

Quantile Regression Results of the Income Equation of the Public and Private Sectors From 2003 to 2021.

Variable	0.1 quantile		0.25 quantile		0.5 quantile		0.75 quantile		0.9 quantile
Variable	Public sector	Non-public sector	Public sector	Non-public sector	Public sector	Non-public sector	Public sector	Non-public sector	Public sector	Non-public sector
Education	0.2103*** (0.005)	0.1635*** (0.004)	0.2019*** (0.004)	0.1665*** (0.004)	0.1942*** (0.003)	0.1637*** (0.003)	0.1893*** (0.004)	0.1429*** (0.003)	0.1713*** (0.005)	0.1259*** (0.004)
Experience	0.0055 (0.004)	0.0201*** (0.004)	0.0125*** (0.003)	0.0239*** (0.003)	0.0188*** (0.003)	0.0275*** (0.003)	0.0149*** (0.003)	0.024*** (0.003)	0.0154** (0.004)	0.0278*** (0.004)
Experience2	0.0001 (0.0001)	−0.0004*** (0.0001)	0.00001 (0.0001)	−0.0004*** (0.0001)	−0.0001** (0.0001)	−0.0004*** (0.0001)	−0.00004 (0.0001)	−0.0004*** (0.0001)	−0.0001 (0.0001)	−0.0005*** (0.0001)
Hukou	0.2326*** (0.032)	0.1802*** (0.022)	0.391*** (0.026)	0.2754*** (0.021)	0.4872*** (0.023)	0.3057*** (0.017)	0.4644*** (0.025)	0.2276*** (0.018)	0.3439*** (0.032)	0.1206*** (0.022)
Marriage	0.031 (0.037)	−0.0128 (0.03)	−0.027 (0.03)	−0.0789*** (0.028)	−0.069*** (0.026)	−0.0386 (0.024)	−0.1258*** (0.029)	0.034 (0.024)	−0.1652*** (0.036)	−0.048 (0.03)
Party	0.0975*** (0.029)	−0.0228 (0.042)	0.0494** (0.023)	−0.0761* (0.04)	0.0384* (0.021)	−0.0567* (0.033)	0.007* (0.023)	0.064* (0.034)	0.0473* (0.029)	0.1405*** (0.042)
Gender	0.1675*** (0.024)	0.298*** (0.021)	0.2118*** (0.02)	0.2925*** (0.019)	0.2436*** (0.018)	0.3464*** (0.016)	0.2574*** (0.019)	0.3565*** (0.017)	0.2397*** (0.024)	0.3294*** (0.02)
Pseudo R²	.1580	.0843	.1624	.0899	.1802	.1047	.1688	.1109	.1561	.1120
Observations	15,556	26,535	15,556	26,535	15,556	26,535	15,556	26,535	15,556	26,535

Note. The errors in parentheses are standard errors.

“***”, “**”, and “*” indicate that the t value is significant at the 1%, 5%, and 10% levels, respectively.
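The sectoral gap in education returns at each quantile can be read directly off Table 7. The sketch below computes the public minus non-public education coefficient for each quantile. It is a simple difference of reported point estimates and does not test whether each gap is statistically significant.

```python
# Education coefficients from Table 7: quantile -> (public, non-public).
EDUCATION_COEFS = {
    0.10: (0.2103, 0.1635),
    0.25: (0.2019, 0.1665),
    0.50: (0.1942, 0.1637),
    0.75: (0.1893, 0.1429),
    0.90: (0.1713, 0.1259),
}

def education_gaps(coefs):
    """Public minus non-public education coefficient at each quantile."""
    return {q: round(pub - non, 4) for q, (pub, non) in coefs.items()}

gaps = education_gaps(EDUCATION_COEFS)
smallest_gap_quantile = min(gaps, key=gaps.get)
```

The gap is positive at every quantile and smallest at the median, so in these point estimates the public sector advantage in education returns is somewhat stronger in the tails than in the middle of the distribution.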

Macro-Level Implications (H2)

Building on the micro-foundations established earlier—where the public sector offers a higher return to education despite an income discount—this section tests the macro-level hypothesis H2: that the relative education premium (Δβ) is a key driver of the public sector’s share of educated employment ( $S_{pub, t}$ ).

To test H2, we first computed the annual education return premium ( $Δ β_{t}$ ) and the public sector employment share ( $S_{pub, t}$ ) for each survey year. We then estimated a linear regression model to examine their relationship:

S_{pub, t} = α + λ Δ β_{t} + ε_{t}

(7)

where λ captures the effect of the education premium on employment share.
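Equation 7 is a bivariate least-squares regression, which can be sketched with closed-form formulas. The yearly premium and share values below are hypothetical and constructed to lie exactly on a line; they are not the study's data.

```python
def simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b * x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must vary")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical yearly education premiums and public employment shares.
premium = [0.0, 0.5, 1.0, 1.5]
share = [0.18 + 0.05 * p for p in premium]
alpha_hat, lambda_hat = simple_ols(premium, share)
```

With only one observation per survey year, a regression of this kind rests on roughly a dozen data points, so the slope is best read as a descriptive association.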

The scatter plot of these variables (Figure 4) reveals a clear positive correlation, suggesting that years with a higher education premium are associated with a greater proportion of individuals selecting into the public sector.

Figure 4.

Scatter plot of education return rate premium and public employment share.

Regression results confirm this relationship: the coefficient on the education premium is positive and statistically significant (λ = 0.048, p < .05), providing strong support for H2. This indicates that the education return premium, as a core micro-mechanism, systematically influences the aggregate allocation of human capital in the economy (Table 8).

Table 8.

Regression Results of Equation 7.

Variable	Coefficient
Education premium ( $Δ β_{t}$ )	0.048** (0.019)
Constant term	0.18*** (0.007)

Note. Standard error in parentheses.

“***” and “**” indicate statistical significance at the 1% and 5% levels, respectively.

This macro-level evidence completes our argument: the “craze” for public sector jobs among the educated is not merely a sociological phenomenon but is rationalized by persistent economic incentives—specifically, the higher valuation of human capital in the public sector. The findings underscore that policy shifts affecting the education premium could have meaningful impacts on labor market dynamics and human capital distribution.

Robustness Checks

To ensure the robustness of our core finding—a positive education return premium in the public sector—we subject it to a battery of tests. These checks address potential methodological biases, including selection on unobservables, model specification, and clustering issues. The results consistently support the premium’s validity, as summarized below.

Alternative Matching Algorithms and Hidden Bias

We first test the sensitivity of our propensity score matching (PSM) results to different matching techniques. Using kernel and radius matching instead of nearest-neighbor matching, the estimated ATT remains stable and significant (Table 9): the full-sample estimates are −0.336 (radius) and −0.332 (kernel), close to the baseline of −0.324. To assess hidden bias, we apply Rosenbaum bounds sensitivity analysis (Table 10). The estimate remains statistically significant at the 1% level even under substantial unobserved confounding (Γ = 2.5), indicating robustness to omitted variables.

Table 9.

ATT Values Obtained by Other Matching Methods.

Year	Matching method	Public sector sample size	Non-public sector sample size	Public sector income premium (ATT)	Standard error	t value
2003	Radius matching	2,237	912	0.016	0.046	0.36
2003	Kernel matching	2,237	912	0.032	0.045	0.7
2005	Radius matching	2,108	2,436	−0.088**	0.037	−2.4
2005	Kernel matching	2,108	2,436	−0.085**	0.035	−2.39
2006	Radius matching	1,941	2,611	−0.066*	0.039	−1.7
2006	Kernel matching	1,941	2,611	−0.067*	0.038	−1.77
2008	Radius matching	1,185	1,715	−0.03	0.052	−0.57
2008	Kernel matching	1,185	1,715	−0.034	0.051	−0.66
2010	Radius matching	1,159	2,439	−0.116***	0.044	−2.62
2010	Kernel matching	1,159	2,439	−0.09**	0.043	−2.12
2011	Radius matching	559	1,343	——	——	——
2011	Kernel matching	559	1,343	——	——	——
2012	Radius matching	1,215	2,819	−0.141***	0.04	−3.56
2012	Kernel matching	1,215	2,819	−0.133***	0.038	−3.52
2013	Radius matching	1,178	2,744	−0.21***	0.037	−5.73
2013	Kernel matching	1,178	2,744	−0.209***	0.036	−5.88
2015	Radius matching	991	2,273	−0.117**	0.048	−2.42
2015	Kernel matching	991	2,273	−0.151***	0.044	−3.4
2017	Radius matching	1,169	2,888	−0.205***	0.041	−5.02
2017	Kernel matching	1,169	2,888	−0.209***	0.04	−5.27
2018	Radius matching	1,134	2,816	−0.14***	0.04	−3.46
2018	Kernel matching	1,134	2,816	−0.125***	0.04	−3.14
2021	Radius matching	680	1,539	−0.175***	0.064	−2.73
2021	Kernel matching	680	1,539	−0.123**	0.059	−2.1
Full sample	Radius matching	15,556	26,535	−0.336***	0.015	−22.61
Full sample	Kernel matching	15,556	26,535	−0.332***	0.015	−22.58

“***”, “**”, and “*” respectively represent passing the significance test at the 1%, 5%, and 10% levels.

Table 10.

Rosenbaum Bounds Sensitivity Analysis for the Propensity Score Model.

Γ	Upper bound point estimate	Lower bound point estimate	Upper bound confidence interval	Lower bound confidence interval
1	5	5	5	5
1.5	5	5	−3.5e-07	5
2	−3.5e-07	5	−3.5e-07	5
2.5	−3.5e-07	5	−3.5e-07	5

Conditional Independence Assumption (CIA) Test

We test the CIA by using only pre-treatment variables (gender, parental employment sector, parental education) for matching and OB decomposition. The balance test confirms covariate balance after matching (Figure 5). The OB decomposition results (Table 11) show that the education premium remains positive and significant, supporting the CIA. We report only the OB decomposition table for brevity, as the Logit and PSM results are intermediate steps.

Figure 5.

The balance test of the CIA test.

Table 11.

OB Decomposition With Pre-treatment Variables.

Total difference	The average logarithmic annual income of the public sector	9.937*** (0.009)
	The mean logarithmic annual income of the non-public sector	10.008*** (0.007)
	Logarithmic income disparity (non-public sector income − public sector income)	0.071*** (0.012)
	Endowment difference	−0.226*** (0.009)
	Coefficient difference	0.297*** (0.013)
Education	Endowment difference	−0.319*** (0.008)
Education	Coefficient difference	−0.529*** (0.05)

Note. Standard errors are in parentheses.

“***” indicates statistical significance at the 1% level.
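
As a reading aid for the decomposition tables in this section, the twofold decomposition can be written as below. This is a sketch in generic notation introduced here for illustration (subscripts $N$ and $P$ for the non-public and public sectors, $\beta^{*}$ for the benchmark coefficients); it may differ in detail from the specification estimated in the paper. The second line checks the additive identity against the totals reported in Table 11.

```latex
\begin{align*}
\bar{y}_{N} - \bar{y}_{P}
  &= \underbrace{(\bar{X}_{N} - \bar{X}_{P})'\beta^{*}}_{\text{endowment difference}}
   + \underbrace{\bar{X}_{N}'(\beta_{N} - \beta^{*})
               + \bar{X}_{P}'(\beta^{*} - \beta_{P})}_{\text{coefficient difference}} \\
\text{Table 11:}\quad 10.008 - 9.937 &= 0.071 = (-0.226) + 0.297
\end{align*}
```

With $\beta^{*} = \beta_{P}$, the coefficient term reduces to $\bar{X}_{N}'(\beta_{N} - \beta_{P})$, so a negative education entry corresponds to a lower return to education in the non-public sector than in the public sector.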

Additional Controls and Clustering Adjustments

We include occupation and firm size controls to account for job characteristics and organizational scale (firm size is measured by the number of employees). This addresses potential omitted variable bias, as these factors may influence both sector choice and income. Because primary sampling unit (PSU) identifiers are missing in some survey years, we cluster standard errors at the province level; treating individual observations as independent could understate the standard errors, so province-level clustering provides a more conservative estimate. The education coefficient difference remains negative and significant (−0.549; Table 12), so the OB decomposition results are robust to these adjustments.

Table 12.

OB Decomposition With Occupation and Firm Size Controls.

Total difference	The average logarithmic annual income of the public sector	9.92*** (0.073)
	The mean logarithmic annual income of the non-public sector	9.96*** (0.076)
	Logarithmic income disparity (non-public sector income − public sector income)	0.041 (0.033)
	Endowment difference	−0.242*** (0.041)
	Coefficient difference	0.283*** (0.036)
Education	Endowment difference	−0.299*** (0.038)
Education	Coefficient difference	−0.549*** (0.102)

Note. Standard errors are in parentheses.

“***” indicates statistical significance at the 1% level.
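
To illustrate the clustering adjustment described above, the following minimal sketch computes a cluster-robust standard error for the slope in a regression of log income on a single regressor. It is not the estimation code behind Table 12: the function name, the toy data, and the one-regressor setup are assumptions made for illustration, and it uses the plain CR0 sandwich formula with no small-sample correction.

```python
from collections import defaultdict
from math import sqrt


def ols_slope_with_cluster_se(x, y, clusters):
    """OLS of y on a constant and x; returns (slope, cluster-robust SE).

    The SE is the CR0 sandwich estimate for the slope, without any
    small-sample correction.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    x_dev = [xi - x_bar for xi in x]
    sxx = sum(d * d for d in x_dev)
    slope = sum(d * (yi - y_bar) for d, yi in zip(x_dev, y)) / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]

    # Sum the scores (x - x_bar) * residual within each cluster, then
    # square the cluster totals to form the "meat" of the sandwich.
    score_by_cluster = defaultdict(float)
    for d, u, g in zip(x_dev, resid, clusters):
        score_by_cluster[g] += d * u
    meat = sum(s * s for s in score_by_cluster.values())
    return slope, sqrt(meat) / sxx


# Hypothetical toy data: years of schooling, log income, province label.
schooling = [9, 12, 16, 9, 12, 16]
log_income = [9.5, 9.9, 10.3, 9.7, 9.8, 10.4]
province = ["A", "A", "A", "B", "B", "B"]
slope, se = ols_slope_with_cluster_se(schooling, log_income, province)
print(f"slope = {slope:.3f}, province-clustered SE = {se:.3f}")
```

Passing a unique label per observation as `clusters` gives the heteroskedasticity-robust (HC0) standard error, which makes it easy to compare the two on the same data.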

OB Decomposition With Alternative Benchmarks

OB decomposition results can be sensitive to the choice of benchmark coefficients. Our baseline uses the public sector as the reference. To test sensitivity, we re-run the decomposition using the private sector as the benchmark (Tables 13 and 14). The education coefficient difference remains negative and significant under both the broad definition (−0.412) and the narrow definition (−0.188), so the education premium is qualitatively unchanged and our results are not driven by benchmark selection.

Table 13.

OB Decomposition With Private-Sector Benchmark (Broad Definition).

Total difference	The average logarithmic annual income of the public sector	9.918*** (0.008)
	The mean logarithmic annual income of the non-public sector	9.953*** (0.007)
	Logarithmic income disparity (non-public sector income − public sector income)	0.034*** (0.011)
	Endowment difference	−0.296*** (0.01)
	Coefficient difference	0.33*** (0.013)
Education	Endowment difference	−0.423*** (0.009)
Education	Coefficient difference	−0.412*** (0.039)

Note. Standard errors are in parentheses.

“***” indicates statistical significance at the 1% level.

Table 14.

OB Decomposition With Private-Sector Benchmark (Narrow Definition).

Total difference	The average logarithmic annual income of the public sector	10.131*** (0.011)
	The mean logarithmic annual income of the non-public sector	9.894*** (0.006)
	Logarithmic income disparity (non-public sector income − public sector income)	−0.237*** (0.012)
	Endowment difference	−0.398*** (0.011)
	Coefficient difference	0.161*** (0.015)
Education	Endowment difference	−0.464*** (0.013)
Education	Coefficient difference	−0.188*** (0.047)

Note. Standard errors are in parentheses.

“***” indicates statistical significance at the 1% level.
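
The role of the benchmark can be illustrated with a minimal sketch of a twofold decomposition that has a single regressor (years of education) and a switchable reference sector. The function names and the toy data are hypothetical, and the model is far simpler than the specification behind Tables 13 and 14; the point is only that the split between endowment and coefficient components changes with the reference while their sum does not.

```python
def ols(x, y):
    """OLS of y on a constant and one regressor x; returns (intercept, slope)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    return y_bar - slope * x_bar, slope


def twofold_ob(x_pub, y_pub, x_priv, y_priv, reference="public"):
    """Decompose mean(y_priv) - mean(y_pub) into endowment and coefficient parts.

    `reference` selects whose education coefficient prices the endowment gap.
    """
    a_pub, b_pub = ols(x_pub, y_pub)
    a_priv, b_priv = ols(x_priv, y_priv)
    xbar_pub = sum(x_pub) / len(x_pub)
    xbar_priv = sum(x_priv) / len(x_priv)
    gap = sum(y_priv) / len(y_priv) - sum(y_pub) / len(y_pub)
    if reference == "public":
        b_ref, xbar_weight = b_pub, xbar_priv
    elif reference == "private":
        b_ref, xbar_weight = b_priv, xbar_pub
    else:
        raise ValueError("reference must be 'public' or 'private'")
    return {
        "gap": gap,
        "endowment": (xbar_priv - xbar_pub) * b_ref,
        "coef_education": xbar_weight * (b_priv - b_pub),
        "coef_constant": a_priv - a_pub,
    }


# Hypothetical toy data: the public sector has the steeper education slope.
x_pub, y_pub = [9, 12, 16, 19], [9.4, 9.7, 10.1, 10.4]
x_priv, y_priv = [9, 9, 12, 16], [9.65, 9.65, 9.8, 10.0]
for ref in ("public", "private"):
    print(ref, twofold_ob(x_pub, y_pub, x_priv, y_priv, reference=ref))
```

In this toy setup the education coefficient component is negative under either reference, mirroring the sign convention used in the tables (non-public minus public).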

Placebo Test With Fake Sector Definition

To rule out spurious correlations, we conduct a placebo test by randomly assigning 30% of the sample to a fake “public sector” (treatment group) and re-running the PSM and OB decomposition. If our model is valid, the fake treatment should yield an insignificant premium. In Table 15, the education coefficient difference is small and statistically insignificant (−0.061, standard error 0.045), supporting the authenticity of our core finding. We report only the OB decomposition table for simplicity.

Table 15.

Placebo Test With Fake Sector Definition.

Total difference	The average logarithmic annual income of the public sector	9.951*** (0.01)
	The mean logarithmic annual income of the non-public sector	9.935*** (0.006)
	Logarithmic income disparity (non-public sector income − public sector income)	−0.016 (0.012)
	Endowment difference	−0.009* (0.005)
	Coefficient difference	−0.007 (0.01)
Education	Endowment difference	−0.015*** (0.006)
Education	Coefficient difference	−0.061 (0.045)

Note. Standard errors are in parentheses.

“***” and “*” indicate statistical significance at the 1% and 10% levels, respectively.
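
The placebo assignment described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used for Table 15: it compares simple education slopes between the fake groups instead of re-running PSM and a full OB decomposition, and the function names, the seed, and the simulated data are all hypothetical.

```python
import random


def assign_fake_sector(n, share=0.30, seed=0):
    """Return n 0/1 flags with round(share * n) ones placed at random."""
    rng = random.Random(seed)
    n_treated = round(share * n)
    flags = [1] * n_treated + [0] * (n - n_treated)
    rng.shuffle(flags)
    return flags


def slope(x, y):
    """OLS slope of y on x (with a constant)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x has no variation in this group")
    return sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx


def placebo_slope_gap(x, y, share=0.30, seed=0):
    """Education slope in the remaining sample minus slope in the fake group."""
    flags = assign_fake_sector(len(x), share, seed)
    fake = [(xi, yi) for xi, yi, f in zip(x, y, flags) if f == 1]
    rest = [(xi, yi) for xi, yi, f in zip(x, y, flags) if f == 0]
    fake_x, fake_y = zip(*fake)
    rest_x, rest_y = zip(*rest)
    return slope(rest_x, rest_y) - slope(fake_x, fake_y)


# Simulated data with one common return to education plus noise, so any
# slope gap between randomly formed groups is due to chance alone.
rng = random.Random(42)
years = [rng.randint(6, 19) for _ in range(2000)]
log_inc = [9.0 + 0.08 * s + rng.gauss(0, 0.5) for s in years]
gaps = [placebo_slope_gap(years, log_inc, seed=k) for k in range(200)]
print("mean placebo slope gap:", sum(gaps) / len(gaps))
```

Repeating the random assignment over many seeds, as in the last lines, gives a reference distribution of slope gaps against which an estimate from the real sector definition could be compared.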

Overall, these tests provide strong evidence that the education return premium is a robust and persistent feature of China’s public sector labor market, resilient to alternative specifications, hidden bias, and placebo interventions.

Conclusions and Implications

This study re-examined the public-private sector compensation differential in China, moving beyond the debate on average wages to focus on the returns to education as a pivotal mechanism driving human capital allocation. Using CGSS data from 2003 to 2021 and a robust multi-method empirical strategy, we established a consistent finding: despite an overall income discount, China’s public sector offers a significant and robust education return premium. This premium rationalizes the sectoral choices of highly educated workers and is positively correlated with the public sector’s share of educated employment at the macro level.

Summary of Findings in Relation to Research Objectives

Our findings can be summarized against the three primary objectives of this paper:

Re-examination of Income Differentials: After correcting for selection bias, we find a persistent income discount (ATT = −0.324) for the broad public sector (including SOEs) compared to the private sector over the study period. However, kernel density plots reveal a more compressed income distribution within the public sector, indicating lower inequality.

Analysis of Education Returns as a Key Mechanism: The OB decomposition reveals that the income gap is primarily driven by a large, positive coefficient effect, underscoring structural differences in how sectors reward characteristics. Crucially, the education variable shows a significant negative coefficient difference (−0.5014). Because the gap is defined as non-public minus public sector income, this negative value indicates a higher marginal return to each year of education in the public sector, which is the central mechanism we hypothesized.

Addressing Methodological Challenges: The findings from PSM, OB decomposition, and Quantile Regression are mutually reinforcing and robust to an extensive battery of checks, including Rosenbaum bounds, CIA tests, placebo tests, and alternative model specifications, ensuring the credibility of our conclusions.

Policy Implications

Our results offer nuanced implications for policymakers:

For Public Sector Reform: Rather than across-the-board wage increases, which may be fiscally challenging, policies should focus on optimizing the existing education premium. This can be achieved by strengthening performance-based advancement systems and explicitly linking promotions and skill-based pay to educational attainment, thereby enhancing the efficiency of human capital utilization within the public sector.

For Private Sector Development: To compete for top talent, private enterprises need to address their relative disadvantage. Strategies should include developing comprehensive compensation packages that enhance social security, pension benefits, and job stability, effectively closing the non-wage benefit gap with the public sector.

Limitations and Future Research

Our study has limitations, most notably potential measurement error from relying on self-reported income data, which may not fully capture non-wage benefits. Future research could use administrative data to quantify total compensation more precisely and to explore regional and industrial heterogeneity in the education premium.

Footnotes

ORCID iD

Yanda Hang

Ethical Considerations

This study did not involve human or animal participants, nor did it require ethical review.

Consent to Participate

This study did not involve human participants or personal data requiring informed consent.

Author Contributions

Ping Li: Conceptualization, Writing—Review & Editing; Yanda Hang: Investigation, Software, Formal Analysis, Writing—Original Draft.

Funding

The authors received no financial support for the research, authorship, and/or publication of this article.

Declaration of Conflicting Interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Data Availability Statement

The data that support the findings of this study are openly available in Zenodo at .
