What Drives Reticence? Reporting Bias from Monopolies and Distrustful Firm Managers

Abstract

This article examines the determinants of underreporting (reticence) on randomized response questions. A simple model is created to describe the interview process and draw some conclusions as to why people might misreport their true status despite being given the assurance of institutional and statistical confidentiality. By looking at the relationship between firm-specific reticence and other firm-specific and industry-location-specific variables, it is found that (mis)trust (proxied by the proportion of contracts arranged before delivery) is a significant predictor of reticence. Underreporting does not seem to be significantly related to a misunderstanding of the procedure, education, profit levels or guilt. This suggests that firms which are more cautious in their business dealings are also more cautious with the randomized response (RR) technique. In such cases, weighted estimates of the prevalence of sensitive traits might be derived without the use of the RR technique but through the use of variables relating to the nature of firm-level contracts. Moreover, more accurate data on sensitive topics can be extracted from large homogeneous populations.

JEL: C81, C83, D81

Keywords

Trust, behavioural economics, social behaviour, risk, reticence

Introduction

The incentive for revealing information about illicit corporate practices might be increased by payments and/or offers of clemency; prosecutors might offer crown witness protection for those willing to reveal sensitive information that helps to prosecute a corrupt agent. The United States Securities and Exchange Commission requires all listed corporations and their subsidiary companies to have anonymous whistle-blower hotlines for company personnel and third parties. Encouraging whistle-blowing is thought to be an effective way of increasing the probability of catching agents involved in corruption. However, whistle-blowers must find credible people to contact, either within a company or outside a company.

Getting good data is crucial in empirical analyses of sensitive topics (Iarossi, 2006). However, many individuals are hesitant to directly reveal sensitive information. In response to the difficulty in getting sensitive data from individuals, researchers use different methods to overcome this barrier to data collection. Some of these methods include direct and indirect questioning methods. The randomized response (RR) method was designed to reduce the social desirability and non-response biases in the revealing of sensitive data. Answering questions via this method removes the stigma attached to revealing sensitive and/or negative information about oneself. This occurs because the response to the sensitive question(s) is determined by a randomizing device, making it impossible to tell if the answer received directly corresponds to the guilt/innocence of the interviewee.

Despite these considerations, the RR technique has failed to eradicate social desirability bias. Despite being assured of statistical anonymity, interviewees refuse to give a response that might implicate them in some wrongdoing or suggest that they do something considered immoral. This hesitance to follow the rules of the RR procedure when those rules might implicate an individual in wrongdoing is labelled ‘reticence’ (Azfar & Murrell, 2009).

Reticence amongst firm managers responding to RR questions has been noted by numerous scholars in recent years (Azfar & Murrell, 2009; Clausen, Kraay, & Murrell, 2010; Karalashvili, Kraay, & Murrell, 2015). The idea that managers refuse to disclose information despite guaranteed statistical anonymity has caused researchers to search for reasons for this behaviour (Kundt, Misch, & Nerré, 2016). A natural response to the problem of reticence is to find out what determines the phenomenon and accordingly adjust the responses from RR questions. Statistical methods to describe the prevalence and effects of reticence have been developed (Kraay & Murrell, 2013). The research community also knows that reticent managers respond differently from non-reticent managers when asked non-sensitive survey questions. For example, there is evidence that reticent managers report paying their workers higher wages whilst in reality paying wages no higher than their competitors (Clarke, 2012); also, reticent managers are more likely to answer questions on corruption but less likely to report corruption (Clarke, Friesenbichler, & Wong, 2015). This article investigates the reasons for underreporting of sensitive acts by firms, even when those same firms have protection against being found guilty of those acts.

Whilst previous research has found that reticence increases with competition (Friesenbichler, Clarke, & Wong, 2014), the current study finds that when asking companies RR questions, reticence can decrease with competition. The role of trust and the perceived risk of detection is examined and found to have significant effects on the propensity to report truthfully. The results suggest that some of the factors which affect the manner in which companies engage in trade might also affect the exchange of sensitive information.

This study uses a set of seven sensitive RR questions in order to examine misreporting. Managers were given the instructions for the RR process before receiving the RR questions. The instructions stated that the managers were to privately flip a coin before answering each question. If the result of the flip was ‘Heads’, they were instructed to respond ‘Yes’ to the corresponding question regardless of the truth; if the result was ‘Tails’, they were instructed to respond with the truth. A firm is labelled as reticent if it responds ‘No’ to all seven questions because of the extremely low probability of getting seven Tails and being innocent of each act.¹ Firms which respond ‘Yes’ to at least one sensitive question are labelled as ‘Possibly Candid’ due to the possibility that they might have underreported on at least one question but still show some signs of cooperating with the procedure (Azfar & Murrell, 2009). The sum score is defined as the total number of Yeses that the firm reports. Therefore, reticent firms have a sum score of zero, whilst possibly candid firms have a sum score of at least one. There was no non-response in the sample; all firms gave an answer to all seven sensitive questions.
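The classification rule and the probability argument above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, coin flips are taken to be fair and independent across questions, and `prevalence` (a common probability of being guilty of each act) is an illustrative parameter rather than a quantity estimated in this study.

```python
def sum_score(answers):
    """Count the 'Yes' answers (True values) across the sensitive RR questions."""
    return sum(bool(a) for a in answers)


def is_reticent(answers):
    """Reticence dummy: 1 if the sum score is zero, 0 (possibly candid) otherwise."""
    return 1 if sum_score(answers) == 0 else 0


def prob_all_no_if_compliant(n_questions=7, p_tails=0.5, prevalence=0.0):
    """Probability that a fully compliant respondent says 'No' to every question.

    A compliant respondent says 'No' only when the coin shows Tails and the
    respondent is innocent of the act. Independence across questions and a
    common prevalence are simplifying assumptions.
    """
    return (p_tails * (1.0 - prevalence)) ** n_questions


# Upper bound on the probability: everyone is innocent of every act.
print(prob_all_no_if_compliant())  # 0.0078125 (= 1/128)
```

Under these assumptions, fewer than 1 per cent of fully compliant respondents would be expected to record a sum score of zero, which can be compared with the mean of the reticence dummy (0.129) reported in Table 4.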

Framework

Firm managers are given protection from being identified as guilty of an act; this protection is through the statistical anonymity provided by attaching their response to the outcome of a coin toss. Managers are asked a sensitive question while they privately toss a coin and are told to respond ‘Yes’ if they got a ‘Heads’ or if they did the sensitive act in question. This occurs seven times. The same procedure is followed for three non-sensitive questions, making a total of 10 questions. This framework makes it impossible to tell whether a ‘Yes’ is a sign of guilt or a sign of getting a ‘Heads’. Despite this protection from their guilt being identified, firm managers still choose to misreport their status when asked sensitive questions. In this study, the manager is labelled as reticent if he/she answers ‘No’ to all seven sensitive questions. This study investigates the reason for reticence using the following potential explanations:

Potential Reasons for Reticence

A Misunderstanding of the Randomized Response Procedure

Despite the instructions that are given, many people are said to not understand the process (Lambsdorff, 2012).² A misunderstanding of the requirements of the procedure might cause people to report ‘No’ every time to a series of sensitive questions even when a ‘Yes’ is required. This misunderstanding might be related to the general miscommunication that can occur in surveys.

A Lack of Understanding of Probability or Statistics

The RR procedure is meant to protect interviewees from being identified as guilty, therefore allowing them to admit to sensitive acts without the stigma of revealing their true behaviour. If the interviewees understood what was required, they might still have misreported their status because they did not know that the probability of tracing guilt on an individual question back to them is zero. One might describe such a lack of knowledge as asymmetric information about the probability of observing a Yes/No in the RR procedure.

Firm Outcomes

A highly profitable firm might stand to lose a lot if it is prosecuted for business malpractice. Firm managers might be unwilling to admit to sensitive acts if they have a lot to lose by doing so. Therefore, firms with larger sales and profits might be more reticent than unprofitable firms.

Guilt

Firms which have something to hide might choose to be reticent. If this is the case then the reticent firms are guilty of most of the sensitive acts and the possibly candid are less guilty. There is evidence that reticent firms underreport when asked other sensitive questions (Clausen et al., 2010).

Trust

Despite firms being aware of the procedure, they might be suspicious about the whole process and believe that the technique is not safe from being used against them somehow. Firms might think that they are being tricked into saying ‘Yes’ and will be targeted or investigated by law enforcement officials despite a ‘Yes’ not revealing anything about the firm’s behaviour. This might be linked to the psychic cost of admitting that one has done an act which one has not done.

Detection

The surveys were anonymous; however, firms revealed information about themselves that could be used, in conjunction with information about the overall population of firms and their characteristics, to identify them. For example, if a firm is the only wood manufacturing firm with more than 99 employees in Lagos state, then that firm would be more easily identifiable than a firm that was amongst 100 other competitors in its industry-location-size cell. Reticence might be a response to this increased perceived probability of detection.

Data and Construction of Variables

This section describes the data used in this study, which come from wave 1 and wave 2 of the World Bank Enterprise Survey for Nigeria, conducted in 2007/2008 and 2008/2009, respectively. The data set is the same one that was used in Clausen et al. (2010). Descriptions of the variables used in the analysis and summary statistics are provided in the appendix. Tables 1, 2 and 3 contain descriptions of the data. Variables include owner and manager characteristics, the nature of contracts (the variables relating to trust), the RR questions and the created RR variables. Tables 4 and 5 show summary statistics for the data. Table 6 shows a correlation matrix for some of the variables used in this study.

Trust and Experience in the Business Environment

This study uses three variables as the main measures of trust. These are the percentage of total orders that are written, as opposed to oral with a witness and oral without a witness; the percentage of material orders that are paid for after delivery, as opposed to on delivery or before delivery; and the percentage of sales orders that are paid for before delivery, as opposed to on delivery or after delivery. These variables are chosen because they seem to represent things that a non-trusting company should care about.

This study also uses a dummy variable for whether or not a company subcontracted any part of its production process to another company. Subcontracting work to another company arguably requires a level of trust that the company will perform the work and do it to a certain standard. Also, the number of years that the company has known the primary supplier for its main input is used as an explanatory variable. This is meant to denote the level of mistrust. The years of experience of the senior manager and its squared term are also used to see the effect on reticence of business experience.

In addition, the previously mentioned measures of trust are also measures of whether others trust the manager. For example, whether the managers use a written contract will depend on how much the firm manager trusts the suppliers and how much the suppliers trust the firm managers. In general, for the variables that are used in this study, the trust of the manager towards other managers and the trust of other managers towards the managers are negatively correlated with each other. A manager who is less trusting is more likely to pay for a purchase order (provide a sales order) after receiving the goods (money) whereas the receipt of payment after provision of goods (payment of money before receiving the goods) is a signal of a trusting supplier of goods (purchaser of goods). Nevertheless, the measures of trust used in this study do not relate to trust in an individual supplier.

Table 1.

Data Description

Category	Variable Name	Definition		Measurement
Owner and Manager Characteristics	Owner_male	Dummy = 1 if owner is male		{0,1}
	Owner_age	What is the age bracket of the sole owner or majority shareholder?		{30 or less, 31–45, 46–55, 55 or more}
	Owner_educ	What is the highest level of education of the sole owner or highest shareholder?		See Table 8
	Male_mgr	Dummy = 1 if top manager is male		{0,1}
	Mgr_age	What is the age bracket of the top manager?		{30 or less, 31–45, 46–55, 55 or more}
	Mgr_educ	What is the highest level of education of the top manager?		See Table 8
	Mgr_experience	How many years of experience does the top manager have?		0 ≤ x
Nature of Contracts (Variables Related to Trust)	Sales_paid_before_delivery	Last year, what percentage of your establishment’s sales were:	Paid for before delivery?	Percentage
	Sales_paid_on_delivery		Paid for on delivery?	Percentage
	Sales_paid_after_delivery		Paid for after delivery?	Percentage
	Mat_paid_before_delivery	Last year, what percentage of total annual purchases of material inputs or services, were:	Paid for before delivery?	Percentage
	Mat_paid_on_delivery		Paid for on delivery?	Percentage
	Mat_paid_after_delivery		Paid for after delivery?	Percentage
	Orders_written	Last year, what percentage of your customer’s purchase orders were:	Written?	Percentage
	Orders_oral_nowitness		Oral, without witness?	Percentage
	Orders_oral_witness		Oral, with witness?	Percentage
	Intermediate_sales	What percentage of this establishment’s total sales came from selling intermediate products and services used as inputs in purchasers’ production processes?		Percentage
	Principal_buyer	Who was the principal buyer for this establishment’s output?		See Table 12
	Primary_supplier	For how many years have you known the primary supplier of the main input used last year?		0 ≤ x
	Subcontract	Last year, did you subcontract any part of your production?		{0,1}

Source: Author, based on data from World Bank.

Table 2.

Data Description

Category	Variable Name	Definition	Measurement
	Wave	Wave of questionnaire: 1 or 2	{1, 2}
	Intcode	Interviewer	Indicator for each interviewer
	Super	Supervisor	Indicator for each supervisor
	Size: small	Size = small (5–19 employees)	{0,1}
	Size: medium	Size = medium (20–99 employees)	{0,1}
	Size: large	Size = large (100 employees and more)	{0,1}
	Competitors_1	No Competitors	{0,1}
	Competitors_2	1 Competitor	{0,1}
	Competitors_3	2–5 Competitors	{0,1}
	Competitors_4	6 + Competitors	{0,1}
RR Questions	rr_personal_taxes	Have you ever paid less in personal taxes than you should have under the law?	{0,1}
	rr_business_taxes	Have you ever paid less in business taxes than you should have under the law?	{0,1}
	rr_job_app_misstatement	Have you ever made a misstatement on a job application?	{0,1}
	rr_office_phone	Have you ever used the office telephone for personal business?	{0,1}
	rr_promotion	Have you ever inappropriately promoted an employee for personal reasons?	{0,1}
	rr_not_pay	Have you ever deliberately not given your suppliers or clients what was due them?	{0,1}
	rr_lie	Have you ever lied in your self-interest?	{0,1}
	rr_hire	Have you ever inappropriately hired a staff member for personal reasons?	{0,1}
	rr_late	Have you ever been purposely late for work?	{0,1}
	rr_dismissal	Have you ever unfairly dismissed an employee for personal reasons?	{0,1}
RR Variables	understood_rr	Dummy = 1 if interviewee understood how RR questions were working	{0,1}
	Sum Score (7)	Index (0–7) of the sum of 7 sensitive RR questions	{0,1,2,3,4,5,6,7}
	reticence	(Reticence) Dummy = 1 if ‘Sum Score (7)’ = 0, 0 otherwise	{0,1}
	Sum Score (10)	Index (0–10) of the sum of all 10 RR questions	{0,1,2,3,4,5,6,7,8,9,10}
	Sum Score (3)	Index (0–3) of the sum of the less sensitive RR questions	{0,1,2,3}

Source: Author, based on data from World Bank.

Table 3.

Data Description

Category	Variable Name	Definition	Measurement
Owner And Manager Characteristics	oage_1	owner_age = 30 years or less	{0,1}
	oage_2	owner_age = 31–45	{0,1}
	oage_3	owner_age = 46–55	{0,1}
	oage_4	owner_age = 55 and more	{0,1}
	oeduc_1	owner_educ = no education	{0,1}
	oeduc_2	owner_educ = started but did not complete primary school	{0,1}
	oeduc_3	owner_educ = primary school	{0,1}
	oeduc_4	owner_educ = started but did not complete secondary school	{0,1}
	oeduc_5	owner_educ = secondary school	{0,1}
	oeduc_6	owner_educ = vocational training	{0,1}
	oeduc_7	owner_educ = some university training	{0,1}
	oeduc_8	owner_educ = graduate degree (ba, bsc, etc.)	{0,1}
	oeduc_9	owner_educ = mba from university in this country	{0,1}
	oeduc_10	owner_educ = mba from university in another country	{0,1}
	oeduc_11	owner_educ = other post-graduate degree from university in this country	{0,1}
	buyer_1	principal_buyer = your parent company or affiliated establishment	{0,1}
	buyer_2	principal_buyer = large firms (more than 100 workers)	{0,1}
	buyer_3	principal_buyer = medium private firms (20–100 workers)	{0,1}
	buyer_4	principal_buyer = small private firms (less than 20 workers)	{0,1}
	buyer_5	principal_buyer = individuals	{0,1}
	buyer_6	principal_buyer = government or government agencies (including state-owned enterprises)	{0,1}
	buyer_7	principal_buyer = other	{0,1}
	south	Dummy = 1 if region = south, 0 otherwise	{0,1}
	secondary	Dummy = 1 if owner has secondary education, 0 otherwise	{0,1}
	tertiary	Dummy = 1 if owner has tertiary education, 0 otherwise	{0,1}
	manu	Dummy = 1 if firm is in manufacturing sector, 0 otherwise	{0,1}
	retail	Dummy = 1 if firm is in retail sector, 0 otherwise	{0,1}
	wave2	Dummy = 1 if wave = 2, 0 otherwise	{0,1}
	industry-location population	Number of firms (population) in this observation’s industry-location cell	1 ≤ x

Source: Author, based on data from World Bank.

Level of Anonymity and the Fear of Detection

One reason for reticence might be the fear of detection. Reticent companies might, arguably and irrationally, reason that the answers from the RR procedure could be used against them somehow. If so, since the surveys were carried out anonymously, the possibility of detection would be related to the possibility that the firm could be identified by observables. One of these is the number (population) of other companies in the industry-location cell. On the one hand, if a company is the only chemical manufacturing business in Bauchi state, then it might fear that this fact makes it identifiable. On the other hand, if a company is one of 100 wood manufacturing businesses in Lagos, then it might be more at ease with answering sensitive questions, since the possibility of detection, as described above, would be reduced.

Table 4.

Summary Statistics

	(1)	(2)	(3)	(4)	(5)
VARIABLES	N	Mean	sd	min	max
owner_male	3,101	0.843	0.364	0	1
owner_age	3,100	2.350	0.870	1	4
owner_educ	3,097	4.821	1.830	1	11
male_mgr	492	0.882	0.323	0	1
mgr_age	492	1.884	0.748	1	4
mgr_educ	3,098	4.827	1.744	1	11
mgr_experience	3,197	11.49	7.046	1	45
sales_paid_before_delivery	3,200	33.66	33.63	0	100
sales_paid_on_delivery	3,200	54.34	36.88	0	100
sales_paid_after_delivery	3,200	12.00	20.94	0	100
orders_written	3,200	33.85	41.61	0	100
orders_oral_nowitness	3,200	38.35	42.09	0	100
orders_oral_witness	3,200	27.80	38.79	0	100
intermediate_sales	1,758	7.449	17.68	0	100
mat_paid_before_delivery	3,200	29.83	36.87	0	100
mat_paid_on_delivery	3,200	54.63	41.42	0	100
mat_paid_after_delivery	3,200	15.53	26.00	0	100
primary_supplier	3,174	7.148	6.211	0	151
subcontract	1,761	0.125	0.331	0	1
wave	3,200	1.644	0.479	1	2
intcode	2,062	307.4	143.3	3	509
super	2,062	34.83	24.54	1	205
size: small	3,200	0.792	0.406	0	1
size: medium	3,200	0.204	0.403	0	1
size: large	3,200	0.00344	0.0585	0	1
competitors_1	1,763	0.0312	0.174	0	1
competitors_2	1,763	0.0159	0.125	0	1
competitors_3	1,763	0.185	0.389	0	1
competitors_4	1,763	0.767	0.423	0	1
rr_personal_taxes	3,200	0.503	0.500	0	1
rr_business_taxes	3,200	0.425	0.494	0	1
rr_job_app_misstatement	3,200	0.429	0.495	0	1
rr_office_phone	3,200	0.505	0.500	0	1
rr_promotion	3,200	0.404	0.491	0	1
rr_not_pay	3,200	0.379	0.485	0	1
rr_lie	3,200	0.518	0.500	0	1
rr_hire	3,200	0.407	0.491	0	1
rr_late	3,200	0.478	0.500	0	1
rr_dismissal	3,200	0.361	0.480	0	1
Sum Score (7)	3,200	2.909	1.727	0	7
reticence	3,200	0.129	0.336	0	1
Sum Score (10)	3,200	4.411	2.100	0	10
Sum Score (3)	3,200	1.502	0.958	0	3

Source: Author, based on data from World Bank.
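The RR item means in Table 4 can be related to the prevalence of each act with a short calculation. The sketch below assumes the forced-response design described earlier (a fair coin, ‘Heads’ forces a ‘Yes’, ‘Tails’ requires the truth) and full compliance by every respondent, so that the share of ‘Yes’ answers equals 0.5 + 0.5 × prevalence. The function name `estimate_prevalence` is hypothetical and this is not presented as the estimator used in this study.

```python
def estimate_prevalence(share_yes, p_forced_yes=0.5):
    """Back out the prevalence of a sensitive act from the share of 'Yes' answers.

    Assumes a forced-response design with full compliance:
        share_yes = p_forced_yes + (1 - p_forced_yes) * prevalence
    """
    if not 0.0 <= p_forced_yes < 1.0:
        raise ValueError("p_forced_yes must lie in [0, 1)")
    return (share_yes - p_forced_yes) / (1.0 - p_forced_yes)


# Means of four RR items as reported in Table 4.
table4_means = {
    "rr_lie": 0.518,
    "rr_office_phone": 0.505,
    "rr_business_taxes": 0.425,
    "rr_dismissal": 0.361,
}

for name, share in table4_means.items():
    print(name, round(estimate_prevalence(share), 3))
```

Any item with a ‘Yes’ share below 0.5 yields a negative implied prevalence, which cannot occur under full compliance; this is the sense in which the aggregate responses point to underreporting.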

Table 5.

Summary Statistics

	(1)	(2)	(3)	(4)	(5)
VARIABLES	N	Mean	sd	min	max
oage: ≤ 30	3,100	0.154	0.361	0	1
oage: 31–45	3,100	0.452	0.498	0	1
oage: 46–55	3,100	0.284	0.451	0	1
oage: ≥ 55	3,100	0.110	0.313	0	1
oeduc_1	3,097	0.0349	0.183	0	1
oeduc_2	3,097	0.0804	0.272	0	1
oeduc_3	3,097	0.0746	0.263	0	1
oeduc_4	3,097	0.303	0.460	0	1
oeduc_5	3,097	0.161	0.368	0	1
oeduc_6	3,097	0.107	0.309	0	1
oeduc_7	3,097	0.199	0.399	0	1
oeduc_8	3,097	0.0239	0.153	0	1
oeduc_9	3,097	0.00484	0.0694	0	1
oeduc_10	3,097	0.00969	0.0980	0	1
oeduc_11	3,097	0.00161	0.0402	0	1
buyer_1	3,179	0.00723	0.0848	0	1
buyer_2	3,179	0.00849	0.0918	0	1
buyer_3	3,179	0.0242	0.154	0	1
buyer_4	3,179	0.0780	0.268	0	1
buyer_5	3,179	0.862	0.345	0	1
buyer_6	3,179	0.00975	0.0983	0	1
buyer_7	3,179	0.0101	0.0998	0	1
south	3,200	0.462	0.499	0	1
secondary	3,200	0.484	0.500	0	1
tertiary	3,200	0.0709	0.257	0	1
manu	3,200	0.551	0.497	0	1
retail	3,200	0.167	0.373	0	1
wave2	3,200	0.644	0.479	0	1
industry-location population	3,200	153.8	113.8	8	556

Source: Author, based on data from World Bank.

The preceding analysis suggests the use of the number (population) of companies in the industry-location cell as a variable to measure the level of identifiability of the company: the larger the population of companies, the lower the expected propensity to be reticent. This study also includes the number of close competitors, as reported by the company itself, as an explanatory variable. Because this variable is self-reported by the company, it might be more closely related to reticence than the total population of firms. The total population of firms is, arguably, less known to the individual company than the number of close competitors. Also, the total population in the industry-location cell might include somewhat different companies in terms of both product and location.
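As an illustration of how an industry-location population variable of this kind might be constructed, the sketch below counts firms per cell in a toy sampling frame. The records and the helper name `population_of` are hypothetical and serve only to show the mechanics; they are not drawn from the World Bank data.

```python
from collections import Counter

# Hypothetical sampling-frame records: (industry, state). Illustrative only.
frame = [
    ("wood", "Lagos"), ("wood", "Lagos"), ("wood", "Lagos"),
    ("chemicals", "Bauchi"),
    ("food", "Kano"), ("food", "Kano"),
]

# Number of firms in each industry-location cell.
cell_population = Counter(frame)


def population_of(industry, state):
    """Return the number of firms sharing this firm's industry-location cell."""
    return cell_population[(industry, state)]


print(population_of("chemicals", "Bauchi"))  # 1: the firm is alone in its cell
print(population_of("wood", "Lagos"))        # 3
```

A firm in a cell of size one is, on this reasoning, the most identifiable from observables.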

Empirical Methodology

Misclassification of Reticence

A reticent interviewee is defined as one who answers ‘No’ with a positive probability when they are supposed to answer ‘Yes’, when answering ‘Yes’ might be interpreted as them having committed a socially undesirable act. This definition does not require that reticent interviewees always give an untruthful response, but that they do so on at least some occasions. In practice, the measure of reticence that is used is the number of ‘Yes’ responses on a series of RR questions. An interviewee with zero Yeses is classified as reticent, and an interviewee with at least one Yes is classified as possibly candid.
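Because reticent interviewees need not withhold every required ‘Yes’, the zero-Yes rule can miss some of them. A minimal sketch of this point follows, assuming a hypothetical parameter `q`: the probability that a reticent interviewee answers ‘No’ when the procedure requires a ‘Yes’, taken to be constant and independent across questions. Neither `q` nor the function names come from this study.

```python
def prob_zero_yes(q, n_questions=7, p_heads=0.5, prevalence=0.0):
    """Probability that an interviewee records no 'Yes' at all.

    q is the (hypothetical) probability of answering 'No' when a 'Yes'
    is required; q = 0 corresponds to full compliance.
    """
    # A 'Yes' is required after Heads, or after Tails if guilty of the act.
    p_yes_required = p_heads + (1.0 - p_heads) * prevalence
    # 'No' is given when it is the correct answer, or when a 'Yes' is suppressed.
    p_no = (1.0 - p_yes_required) + p_yes_required * q
    return p_no ** n_questions


def share_misclassified_as_candid(q, **kwargs):
    """Share of interviewees with suppression rate q who still show at least one 'Yes'."""
    return 1.0 - prob_zero_yes(q, **kwargs)


for q in (0.25, 0.5, 0.75, 1.0):
    print(q, round(share_misclassified_as_candid(q), 3))
```

Only interviewees who suppress every required ‘Yes’ (q = 1) are certain to be classified as reticent; for any smaller q, a positive share is recorded as possibly candid.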

Table 6.

Correlation Matrix

	r	mgr_experience	mgr_exp2	sales_paid_before_delivery	sales_paid_on_delivery	orders_written	orders_oral_nowitness	orders_oral_witness	subcontract	competitors_1	competitors_2	competitors_3	competitors_4
r	1
mgr_experience	0.07***	1
mgr_exp2	0.05**	1.0***	1
sales_paid_before_delivery	0.03*	0.1***	0.08***	1
sales_paid_on_delivery	–0.04*	–0.1***	–0.08***	–0.8***	1
orders_written	0.1***	–0.006	–0.008	0.2***	–0.2***	1
orders_oral_nowitness	–0.01	–0.01	–0.03	–0.07***	0.06***	–0.6***	1
orders_oral_witness	–0.10***	0.02	0.04*	–0.1***	0.2***	–0.5***	–0.5***	1
subcontract	–0.04	–0.04	–0.04	0.04	–0.2***	0.03	0.07**	–0.09***	1
competitors_1	0.1***	0.05*	0.05*	0.03	–0.08***	0.04	–0.04	–0.006	–0.009	1
competitors_2	0.08***	0.02	0.01	0.02	–0.04	–0.003	0.02	–0.02	–0.02	–0.02	1
competitors_3	0.07**	–0.005	–0.005	0.03	–0.007	–0.04	–0.02	0.06**	–0.06*	–0.09***	–0.06*	1
competitors_4	–0.1***	–0.02	–0.02	–0.04	0.05*	0.02	0.03	–0.05*	0.06**	–0.3***	–0.2***	–0.9***	1

Source: Author, based on data from World Bank.

Note: *p<0.05; **p<0.01; ***p<0.001.

Table 7.

Choice for Each Manager

	Interviewer
	A (Good)	B (Bad)
Tell Truth	U_TA	U_TB
Lie	U_LA	U_LB

Source: Author’s own.

One potential problem with trying to find the causes of reticence is that the traditional measure of reticence might misclassify firms. Labelling all firms which answer ‘No’ all the time as reticent and the rest of the sample as possibly candid potentially misclassifies some of the reticent as possibly candid. This can be described using the latent variable specification of a binary outcome model (Greene, 1990; McFadden, 1984) and the notation of Hausman, Abrevaya and Scott-Morton (1998), where $y_{i}^{true}$ is a latent variable, and $i$ ranges from 1 to the sample size $N$. The latent variable can be described by:

$$y_{i}^{true} = x_{i}^{'} \beta + \varepsilon_{i}$$

where $\varepsilon_{i}$ is an independent and identically distributed error term. The observed response can be represented as follows:

$$y_{i}^{observed} = 1 (y_{i}^{true} \geq 0)$$

where $y_{i}^{observed}$ is the reported answer, and 1(E) represents the indicator function that is equal to 1 if E is true and 0 if E is false. In the absence of misclassification of the binary variable, the observed response is also the true response.

This study first focuses on the type of misclassification in which the misclassification error depends on the true response, $y_{i}^{true}$, but is independent of the explanatory variables, $x_{i}$.

The definition of reticence used previously can be represented as follows:

Table 8.

Tabulation of Sum Score and Median Level of Education of Largest Shareholder of the Companies

Education Level	Sum Score								Average
	0	1	2	3	4	5	6	7
Median Education Level	Secondary School	Secondary School	Secondary School	Secondary School	Incomplete Secondary School	Incomplete Secondary School	Incomplete Secondary School	Incomplete Secondary School	Secondary School
Total	13.04%	9.59%	15.47%	22.02%	21.99%	12.59%	3.87%	1.42%	100.00%

Source: Author, based on data from World Bank.

Notes: N = 3,097. Mean Sum Score = 2.91. Median Sum Score = 3. The options, in ascending order, included: No education; started but did not complete primary school; primary school; started but did not complete secondary school; secondary school; vocational training; some university training; graduate degree (ba, bsc, etc.); mba from university in this country; mba from university in another country; other post-graduate degree from university in this country; other post-graduate degree from university in another country.

$$r = \begin{cases} 1 & \text{if } s = 0 \\ 0 & \text{if } s \geq 1 \end{cases}$$

where r represents reticence and s is the number of Yeses to a series of randomized response questions. Under this criterion, some firms may be classified as 0 (possibly candid) when in fact they should be 1 (reticent). In such a case, where some 0s should be 1s:

$$\alpha_{1} = \Pr (y_{i}^{observed} = 0 \mid y_{i}^{true} = 1) \neq 0,$$

where α₁ is the false-negative misclassification error. Due to the relatively sensitive nature of the questions and the relatively low probability of getting seven tails from seven coin flips and being innocent of all acts, this study argues that the alternative misclassification error is not significantly different from zero:

$$\alpha_{0} = \Pr (y_{i}^{observed} = 1 \mid y_{i}^{true} = 0).$$

Despite this, the false-positive misclassification error α₀ will be incorporated into the analysis and tested for significance.

In the standard case of misclassification of a binary dependent variable, the expected value of the observed dependent variable is:

$$\begin{aligned} E (y_{i}^{observed} \mid x_{i}) &= \Pr (y_{i}^{observed} = 1 \mid x_{i}) \\ &= \alpha_{0} + (1 - \alpha_{0} - \alpha_{1}) F (x_{i}^{'} \beta) \end{aligned} \tag{1}$$

where $F$ is the cumulative distribution function of $\varepsilon_{i}$. The probability of observing a zero is given by:

$$\Pr (y_{i}^{observed} = 0 \mid x_{i}) = (1 - \alpha_{0}) + (\alpha_{0} + \alpha_{1} - 1) F (x_{i}^{'} \beta)$$

These two probabilities collapse to the usual expressions, $F (x_{i}^{'} \beta)$ and $1 - F (x_{i}^{'} \beta)$ respectively, when there is no misclassification error in the binary dependent variable (α₀ = α₁ = 0).
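The two response probabilities can be evaluated numerically as follows. This sketch assumes that $F$ is the standard normal distribution function (a probit specification); the function names and the example values of α₀ and α₁ are illustrative assumptions, not estimates from this study.

```python
import math


def norm_cdf(z):
    """Standard normal CDF, used here as an assumed choice for F."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_observed_one(xb, alpha0, alpha1, cdf=norm_cdf):
    """Pr(y_observed = 1 | x) as in Equation (1); xb is the index x'beta."""
    return alpha0 + (1.0 - alpha0 - alpha1) * cdf(xb)


def prob_observed_zero(xb, alpha0, alpha1, cdf=norm_cdf):
    """Pr(y_observed = 0 | x), the complement of Equation (1)."""
    return (1.0 - alpha0) + (alpha0 + alpha1 - 1.0) * cdf(xb)


xb = 0.0
print(prob_observed_one(xb, 0.0, 0.0))   # 0.5: no misclassification, equals F(0)
print(prob_observed_one(xb, 0.0, 0.2))   # lower: some true 1s are recorded as 0s
```

With α₀ = 0 and α₁ > 0, the probability of observing a 1 is scaled down by the factor (1 − α₁) at every value of the index.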

In the present set-up, the measurement error is negatively correlated with the accurately measured variable; therefore, the classification error will lead to a downward bias in the estimates of the effect of $x$ on $y^{true}$ (Bound, Brown, & Mathiowetz, 2001). To see this, note that the marginal effect of an explanatory variable on the observed response is as follows:

$$\frac{\partial \Pr (y_{i}^{observed} = 1 \mid x_{i})}{\partial x_{i}} = (1 - \alpha_{0} - \alpha_{1}) f (x_{i}^{'} \beta) \beta \tag{2}$$

where $f$ is the density function corresponding to $F$. Whenever some misclassification is present, this is smaller in magnitude than the marginal effect on the true response, $f (x_{i}^{'} \beta) \beta$. Moreover, the marginal effect on the observed response always differs from the marginal effect on the true response by a factor of $(1 - \alpha_{0} - \alpha_{1})$, no matter at what value of $x$ the marginal effects are evaluated.
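A short numerical check of this attenuation is given below. It assumes a standard normal density for $f$ and uses illustrative values for the coefficient and the misclassification errors; the function names are hypothetical.

```python
import math


def norm_pdf(z):
    """Standard normal density, an assumed choice for f."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def marginal_effect_true(xb, beta_k):
    """Marginal effect of regressor k on Pr(y_true = 1 | x)."""
    return norm_pdf(xb) * beta_k


def marginal_effect_observed(xb, beta_k, alpha0, alpha1):
    """Marginal effect of regressor k on Pr(y_observed = 1 | x), as in Equation (2)."""
    return (1.0 - alpha0 - alpha1) * norm_pdf(xb) * beta_k


alpha0, alpha1, beta_k = 0.0, 0.3, 0.8   # illustrative values only
for xb in (-1.0, 0.0, 2.0):
    ratio = marginal_effect_observed(xb, beta_k, alpha0, alpha1) / marginal_effect_true(xb, beta_k)
    print(xb, round(ratio, 6))   # the ratio is 1 - alpha0 - alpha1 at every xb
```

The ratio of the two marginal effects does not depend on where the index is evaluated, which is the property noted in the text.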

The parameters α₀, α₁ and β can be estimated via the maximum likelihood method by maximizing the average log-likelihood:

$$\ell (\alpha_{0}, \alpha_{1}, \beta) = \frac{1}{N} \sum_{i = 1}^{N} \Big[ y_{i}^{observed} \ln \big(\alpha_{0} + (1 - \alpha_{0} - \alpha_{1}) F (x_{i}^{'} \beta)\big) + (1 - y_{i}^{observed}) \ln \big((1 - \alpha_{0}) + (\alpha_{0} + \alpha_{1} - 1) F (x_{i}^{'} \beta)\big) \Big] \tag{3}$$

with respect to α₀, α₁ and β.
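The objective in Equation (3) can be written as a plain Python function. The sketch assumes a probit link for $F$, clips probabilities away from 0 and 1 for numerical safety, and uses a small made-up data set; all names and values are illustrative. In practice the function would be passed to a numerical optimizer, which is not shown here.

```python
import math


def norm_cdf(z):
    """Standard normal CDF, an assumed choice for F."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def log_likelihood(y, X, beta, alpha0, alpha1, eps=1e-12):
    """Average log-likelihood of Equation (3) for binary y and regressor rows X."""
    total = 0.0
    for y_i, x_i in zip(y, X):
        xb = sum(x_ij * b_j for x_ij, b_j in zip(x_i, beta))
        F = norm_cdf(xb)
        p1 = alpha0 + (1.0 - alpha0 - alpha1) * F          # Pr(observed = 1 | x)
        p0 = (1.0 - alpha0) + (alpha0 + alpha1 - 1.0) * F  # Pr(observed = 0 | x)
        # Clip to avoid log(0) at extreme parameter values.
        p1 = min(max(p1, eps), 1.0 - eps)
        p0 = min(max(p0, eps), 1.0 - eps)
        total += y_i * math.log(p1) + (1 - y_i) * math.log(p0)
    return total / len(y)


# Made-up data: a constant and one regressor (illustrative only).
y = [1, 0, 1, 0]
X = [[1.0, 0.5], [1.0, -1.0], [1.0, 2.0], [1.0, 0.0]]
beta = [0.1, 0.7]

print(log_likelihood(y, X, beta, alpha0=0.0, alpha1=0.1))
```

Setting both misclassification errors to zero recovers the ordinary probit log-likelihood, so the standard model is nested as a special case.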

An additional condition is required for identification of the parameters in the model. This condition, the monotonicity condition, states that the sum of the misclassification errors must be less than 1.

Monotonicity Condition (MC1): α₀ + α₁ < 1.

This can be seen when considering a symmetric function F where F (d) = 1 – F (– d).

Define $\tilde{\alpha}_{0} = 1 - \alpha_{1}$, $\tilde{\alpha}_{1} = 1 - \alpha_{0}$ and $\tilde{\beta} = - \beta$. Then:

$$\begin{aligned}
\tilde{\alpha}_{0} + (1 - \tilde{\alpha}_{0} - \tilde{\alpha}_{1}) F (x_{i}^{'} \tilde{\beta}) &= 1 - \alpha_{1} + (1 - (1 - \alpha_{1}) - (1 - \alpha_{0})) (1 - F (x_{i}^{'} \beta)) \\
&= 1 - \alpha_{1} + (1 - 1 + \alpha_{1} - 1 + \alpha_{0}) (1 - F (x_{i}^{'} \beta)) \\
&= 1 - \alpha_{1} + (\alpha_{1} + \alpha_{0} - 1) (1 - F (x_{i}^{'} \beta)) \\
&= 1 - \alpha_{1} + \alpha_{1} + \alpha_{0} - 1 - (\alpha_{1} + \alpha_{0} - 1) F (x_{i}^{'} \beta) \\
&= \alpha_{0} + (1 - \alpha_{0} - \alpha_{1}) F (x_{i}^{'} \beta)
\end{aligned} \tag{4}$$

Therefore, any estimators based on Equation (1), such as maximum likelihood and non-linear least squares, cannot distinguish between (α₀, α₁, β) and (1 – α₁, 1 – α₀, –β). MC1 rules out this possibility because if α₀ + α₁ < 1, then (1 – α₁) + (1 – α₀) > 1.
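The observational equivalence of the two parameter sets can be confirmed numerically. The sketch below assumes a standard normal $F$ (which is symmetric) and illustrative parameter values; the helper `mirrored` is a hypothetical name for the mapping to (1 – α₁, 1 – α₀, –β).

```python
import math


def norm_cdf(z):
    """Standard normal CDF: symmetric, so F(d) = 1 - F(-d)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_observed_one(xb, alpha0, alpha1):
    """Pr(y_observed = 1 | x) from Equation (1)."""
    return alpha0 + (1.0 - alpha0 - alpha1) * norm_cdf(xb)


def mirrored(alpha0, alpha1, beta):
    """Map (alpha0, alpha1, beta) to the equivalent set (1 - alpha1, 1 - alpha0, -beta)."""
    return 1.0 - alpha1, 1.0 - alpha0, -beta


alpha0, alpha1, beta = 0.05, 0.30, 0.8   # illustrative values only
m0, m1, mbeta = mirrored(alpha0, alpha1, beta)

for x in (-2.0, -0.5, 0.0, 1.0, 3.0):
    p = prob_observed_one(x * beta, alpha0, alpha1)
    p_mirror = prob_observed_one(x * mbeta, m0, m1)
    assert abs(p - p_mirror) < 1e-12   # identical fitted probabilities

# Only the original parameter set satisfies MC1.
print(alpha0 + alpha1 < 1, m0 + m1 < 1)  # True False
```

Both parameter sets produce the same probabilities at every value of x, so the data alone cannot separate them; imposing MC1 selects the first set.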

Since this study argues that α₀ is not significantly different from zero, the MC1 condition implies that α₁ < 1, which appears to be reasonable. The only way in which this could be violated is if α₁ = 1, which does not agree with the data since the actual distribution of ‘Yes’ responses shows that more Yeses are observed than expected under the angels assumption; therefore, at least some respondents appear to be answering truthfully. Using this result, it appears that MC1 is satisfied in this study and the parameters (α₀, α₁ and β) can be identified using maximum likelihood estimation.

Empirical Results

This section looks at the empirical results of the analysis. Summary statistics from the data are discussed here. Results from an econometric exercise that models the propensity to be reticent are also analyzed.

Education: Lack of Understanding about the Probability of Detection

Results comparing profit levels amongst different sum score categories are shown in Table 9. Mean profit amongst reticent firms was 4,393,000 Naira and median profit amongst the same set of firms was 1,544,000 Naira. Both values are below the corresponding figures for the entire sample, 4,414,000 Naira and 1,713,000 Naira, respectively. Reticent firms also have lower mean and median profits than the possibly candid firms which said ‘Yes’ once. Both the mean and the median profit of reticent firms sit around the centre of the distribution of profits for all firms. These results provide evidence against the idea that reticent firms have more to lose by revealing their business malpractices. Once again, these differences in profits disappear when controlling for the age of the owner, region, wave and industry; thus profits are not included in the probit analysis in Tables 14 and 15.

Table 9.

Tabulation of Sum Score and Profit (N’000) Levels

Profit Level	Sum Score								Average
	0	1	2	3	4	5	6	7
Median Profit	1,544	1,668	1,894	1,888	1,750	1,297	1,389	1,311	1,713
Mean Profit	4,393	4,850	4,752	4,511	4,184	4,148	3,773	4,164	4,414
Total	12.94%	9.47%	15.72%	21.88%	22.22%	12.56%	3.84%	1.38%	100%

Source: Author, based on data from World Bank.

Notes: N = 3,097. Mean Sum Score = 2.91. Median Sum Score = 3.
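The mean and median sum scores in the notes to Table 9 can be recovered from the distribution of firms across sum scores. The following is a small check using only the percentages printed in the ‘Total’ row; the variable names are illustrative.

```python
# Percentage of firms at each sum score (0 to 7), from the 'Total' row of Table 9.
shares = [12.94, 9.47, 15.72, 21.88, 22.22, 12.56, 3.84, 1.38]

total = sum(shares)  # close to 100; the small gap comes from rounding

# Mean: each score weighted by the share of firms reporting it.
mean_sum_score = sum(score * s for score, s in enumerate(shares)) / total

# Median: the first score at which the cumulative share reaches one half.
cumulative = 0.0
median_sum_score = None
for score, s in enumerate(shares):
    cumulative += s
    if cumulative >= total / 2:
        median_sum_score = score
        break
```

Rounded to two decimal places, the mean is 2.91 and the median is 3, in line with the notes to the table.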

Table 10 investigates the idea that reticent firms are the guilty hiding their guilt. The results tend to refute this claim. Reticent firms are more likely to admit to engaging in bribery than all other (possibly candid) firms. This result remains when looking at all other firms as one group and when separating firms by their sum score. 61 per cent of reticent firms admitted to paying a bribe; this figure is significantly higher than the average of 53 per cent for the entire sample. Thus, it appears that firms are reticent for a reason other than trying to hide their guilt.

Table 10.

Tabulation of Sum Score and Bribery

Sum Score	Firm Paid A Bribe In Past Year?		Total
	No	Yes
0 (Reticent)	38.65	61.35	100.00
1	56.44	43.56	100.00
2	51.49	48.51	100.00
3	50.43	49.57	100.00
4	45.85	54.15	100.00
5	41.29	58.71	100.00
6	43.09	56.91	100.00
7	43.18	56.82	100.00
Total	47.09	52.91	100.00

Source: Author, based on data from World Bank.

Summary Statistics

Summary statistics for trust-related variables are shown in Table 11. Different sets of variables are presented within this table. The first set of variables relates to the companies’ orders. Companies were asked to state the percentage of their customers’ purchase orders that were written; oral, with a witness; and oral, without any witnesses. These respective percentages add up to 100 per cent. The second set of variables relates to the annual purchase of material inputs made by the companies. The companies were asked to state the percentage of total annual material inputs or services that were paid for before delivery, on delivery and after delivery. These percentages also add up to 100 per cent. The next set of variables relates to the sales of the establishment. Companies were asked the percentage of sales that were paid for before delivery, on delivery and after delivery. These percentages also add up to 100 per cent. Companies were also asked if they subcontracted any part of their production to another company. Finally, the survey asked the companies the number of years that they had known the primary supplier of the main input used in their production process.

The summary statistics in Table 11 are presented separately for the reticent and the possibly candid. On average, 45.3 per cent of the purchase orders from reticent companies are written, compared with 32.1 per cent for the possibly candid set of companies. Therefore, using this as a measure of trust, reticent firms seem to trust other companies less than possibly candid firms do. The last column shows that the average ratio of written orders to oral-with-witness orders for reticent firms is 2.43, whereas the same ratio for the possibly candid companies is 1.05. This tends to suggest that the reticent are more cautious in their business transactions than the possibly candid group.

Table 11.

Means of Trust-related Variables, by Reticence

		Means		P-value	Relative Shares (Division of Averages)		Relative Shares (Lower N, due to Division by Zero)	
Description	Variable	Possibly Candid	Reticent		Possibly Candid	Reticent	Possibly Candid	Reticent
Percentage of their customers’ purchase orders that were:	written	32.1	45.3	0.000	1.10	2.53	1.05	2.43
	oral, with witness	29.3	17.9	0.000	1	1	–	–
	oral, without witness	38.6	36.8	0.831	1.32	2.06	0.59	0.87
Percentage of total annual purchases of material inputs or services that were:	paid for before delivery	28.5	38.7	0.000	0.50	0.93	0.73	1.13
	paid for on delivery	56.6	41.4	0.000	1	1	–	–
	paid for after delivery	14.9	19.9	0.000	0.26	0.48	0.50	0.55
Percentage of establishment’s sales that were:	paid for before delivery	33.2	36.7	0.023	0.60	0.72	0.97	0.96
	paid for on delivery	54.9	50.7	0.029	1	1	–	–
	paid for after delivery	11.9	12.7	0.037	0.22	0.25	0.34	0.36
Percentage of establishment’s total sales that came from selling:	intermediate products and services	6.9	11.1	0.024
Did you subcontract any part of your production?		0.13	0.10	0.142
For how many years have you known the primary supplier of the main input used?		7.07	7.70	0.020

Source: Author, based on data from World Bank.

Notes: P–values are from Wilcoxon rank-sum (Mann–Whitney) tests. Wilcoxon signed rank-sum tests for the equality of shares (last 2 columns) reject the null hypothesis of equality for each pair of shares at the 5% level for the first pair, and at the 1% level for all other pairs.
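The ‘Division of Averages’ columns in Table 11 can be reproduced from the group means alone. The sketch below does this for the purchase-order variables, taking ‘oral, with witness’ as the reference category; the function name is hypothetical.

```python
# Group means (per cent) from Table 11: (possibly candid, reticent).
orders = {
    "written": (32.1, 45.3),
    "oral, with witness": (29.3, 17.9),  # reference category
    "oral, without witness": (38.6, 36.8),
}

def relative_shares(means, reference):
    """Divide each group mean by the mean of the reference category."""
    ref_candid, ref_reticent = means[reference]
    return {
        name: (round(candid / ref_candid, 2), round(reticent / ref_reticent, 2))
        for name, (candid, reticent) in means.items()
    }

shares = relative_shares(orders, "oral, with witness")
```

For written orders this gives 1.10 for the possibly candid and 2.53 for the reticent, as in the table. The ‘Lower N’ columns are averages of firm-level ratios, so they cannot be recovered from the means and require the underlying data.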

Both groups receive payment for the majority of their sales either before delivery or on delivery. Only 12–13 per cent of their payments are received after delivery. On average, the reticent have known the primary supplier of their main input for slightly longer than the possibly candid group (7.70 versus 7.07 years). A tabulation of principal buyers of output for the reticent and possibly candid, respectively, is shown in Table 12.

Table 13 shows the means of population-related variables. The reticent seem to be in less populous industry-location cells than the possibly candid group. Additionally, the reticent report having fewer competitors.

Predicting Reticence

Tables 14 and 15 show probit estimations for reticence. The dependent variable in every model is a dummy equal to one if a company is reticent (as defined by Azfar and Murrell (2009)) and zero otherwise. Standard errors are calculated using the Huber–White heteroscedasticity-consistent estimator. The first set of variables is the same as that used in Clausen et al. (2010) as predictors of reticence: gender of owner, age of owner, formal education of owner, industry of company, size of company, region and wave of survey.

Companies with owners who are 55 years old or above are less likely to be reticent than the base group of 46–55. Companies in manufacturing and retail are also shown to be more reticent than the base group, other, which includes information technology, construction and transport, and hotels and restaurants. Companies located in the southern states of Nigeria are also more likely to be reticent than those in the northern states. Also, companies that carried out the survey in the second wave showed more reticence than those which participated in the first wave.

Table 12.

Tabulation of Trust-related Variables, by Reticence

		Percentage
Variable	Categories	Possibly Candid	Reticent
Who was the principal buyer for this establishment’s output?	Your parent company or affiliated establishments	0.61	1.46
	Large private firms (more than 100 workers)	0.72	1.70
	Medium private firms (20–100 workers)	2.20	3.89
	Small private firms (less than 20 workers)	7.33	10.95
	Individuals	87.10	80.29
	Government or government agencies (including state-owned enterprises)	0.94	1.22
	Others	1.08	0.49
Total		100.0	100.0

Source: Author, based on data from World Bank.

Turning attention to the trust-related variables, the coefficients on the three main measures of mistrust have their expected sign, positive, and are statistically significant when they enter the model separately in Models 3, 4 and 5. The coefficients on percentage of written orders and the percentage of material purchases that were paid for before delivery are both significant at the 1 per cent level. The percentage of sales that were paid for before delivery is significant at the 10 per cent level. Furthermore, both the percentage of written orders and the percentage of material inputs that were paid for before delivery keep their sign and significance when all three variables enter jointly into the model (Model 6). Their magnitudes are also similar to those in the previous models, where they enter separately into the estimation. The coefficient for subcontracting of production has its expected sign but is not statistically significant at the 10 per cent level. The number of years that the company has known its primary supplier also has the expected sign, positive, and is statistically significant at the 5 per cent level. The squared term for the length of time that the primary supplier has been known did not enter significantly into this model and is omitted from this table. These results add support for the idea that companies which are less trusting in their business operations are more likely to be reticent in answering RR survey questions.

Table 13.

Means of Population-related Variables

		Means		P-value
Description	Variable	Possibly Candid	Reticent
Objective Variable	Industry-location population	155	148
Subjective Reports	No Competitors	0.023	0.084	0.000
	1 Competitor	0.012	0.042	0.001
	2–5 Competitors	0.175	0.251	0.001
	6+ Competitors	0.790	0.623	0.005
Total		1	1

Source: Author, based on data from World Bank.

Note: P-values are from Wilcoxon rank-sum (Mann–Whitney) tests.

Turning attention to the variables representing the risk of detection, the estimated effect of the population of the company’s industry-location cell on reticence is approximately zero. This does not change across the models in Table 14. In order to use a potentially better measure of the perceived risk of detection, the models in Table 15 use dummies for the number of competitors that the firm faces, as reported by the firm. These are 1 competitor, 2–5 competitors and more than 5 competitors. The excluded category is no competitors. Hence, the dummy for 1 competitor is equal to 1 if the firm reported having 1 competitor, and 0 otherwise. The other dummy variables are constructed similarly.
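The construction of the competitor dummies can be sketched as follows. This is an illustration only: it assumes that the reported number of competitors is available as an integer count, and the function and key names are hypothetical rather than those of the survey data.

```python
def competitor_dummies(n_competitors):
    """Map a reported number of competitors to three 0/1 dummies.

    The omitted (base) category is 'no competitors', so a firm that
    reports zero competitors gets 0 on all three dummies.
    """
    return {
        "competitors_1": int(n_competitors == 1),
        "competitors_2_5": int(2 <= n_competitors <= 5),
        "competitors_6_plus": int(n_competitors >= 6),
    }

example = competitor_dummies(4)  # a firm reporting four competitors
```

Because the base category is left out, each coefficient on these dummies is read relative to a firm that reports no competitors.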

The coefficient on the dummy for 1 competitor has its expected sign but is not significant in any of the models. Therefore, there is some evidence that, relative to the base group of no competitors, having 1 competitor decreases the probability of being reticent, although this finding is not statistically significant at the 10 per cent level. The coefficients for the 2–5 and 6+ dummies, by contrast, both have their expected signs and are both statistically significant at the 1 per cent level in most models. Furthermore, the coefficient on the variable for 6+ competitors is always larger, in absolute value, than the coefficient for 2–5 competitors, which, in turn, is larger, in absolute value, than the coefficient for 1 competitor. This suggests that the more competitors in a market, the less likely a company is to be reticent. These results provide evidence for the perceived risk of detection as a determinant of reticence amongst companies.

Table 14.

Probit Estimates for Reticence

Dependent Variable: reticence	1	2	3	4	5	6	7	8
owner_male	–0.090	–0.083	–0.098	–0.081	–0.087	–0.097	0.129	–0.111
	(0.082)	(0.082)	(0.084)	(0.082)	(0.082)	(0.084)	(0.145)	(0.084)
age: ≤ 30	–0.132	–0.142	–0.110	–0.186*	–0.142	–0.147	–0.081	–0.116
	(0.099)	(0.099)	(0.101)	(0.101)	(0.099)	(0.102)	(0.164)	(0.104)
age: 31–45	–0.033	–0.039	–0.011	–0.051	–0.039	–0.023	–0.057	–0.017
	(0.070)	(0.070)	(0.070)	(0.070)	(0.070)	(0.071)	(0.092)	(0.071)
age: ≥ 55	–0.176*	–0.181*	–0.196*	–0.183*	–0.182*	–0.198*	0.011	–0.242**
	(0.106)	(0.106)	(0.106)	(0.106)	(0.106)	(0.106)	(0.135)	(0.111)
secondary	0.055	0.056	–0.015	0.047	0.051	–0.020	0.123	–0.015
	(0.062)	(0.062)	(0.064)	(0.062)	(0.062)	(0.064)	(0.087)	(0.065)
tertiary	–0.212	–0.209	–0.261	–0.213	–0.203	–0.265	–0.249	–0.247
	(0.173)	(0.173)	(0.174)	(0.174)	(0.173)	(0.176)	(0.283)	(0.175)
manu	0.152**	0.137*	0.182**	0.178**	0.116	0.218***		0.176**
	(0.073)	(0.074)	(0.075)	(0.076)	(0.074)	(0.076)		(0.077)
retail	0.223**	0.203**	0.266***	0.177*	0.202**	0.244**		0.211**
	(0.091)	(0.092)	(0.095)	(0.093)	(0.092)	(0.095)		(0.096)
size: medium	–0.122	–0.111	–0.103	–0.090	–0.099	–0.088	–0.097	–0.107
	(0.078)	(0.079)	(0.080)	(0.080)	(0.080)	(0.081)	(0.109)	(0.082)
size: large	–0.166	–0.142	–0.149	–0.218	–0.122	–0.223		–0.273
	(0.544)	(0.546)	(0.548)	(0.530)	(0.547)	(0.538)		(0.543)
south	0.254***	0.268***	0.221***	0.288***	0.268***	0.241***	0.325***	0.242***
	(0.060)	(0.061)	(0.062)	(0.061)	(0.061)	(0.063)	(0.085)	(0.063)
wave2	0.492***	0.487***	0.585***	0.466***	0.485***	0.564***	0.894***	0.536***
	(0.064)	(0.064)	(0.069)	(0.064)	(0.064)	(0.068)	(0.111)	(0.068)
industry location		–0.000	–0.000	–0.000	–0.000	–0.000	–0.001	–0.000
		(0.000)	(0.000)	(0.000)	(0.000)	(0.000)	(0.000)	(0.000)
orders_written			0.005***			0.005***	0.006***	0.005***
			(0.001)			(0.001)	(0.001)	(0.001)
mat_paid_after_delivery				0.004***		0.003***	0.009***	0.003***
				(0.001)		(0.001)	(0.002)	(0.001)
sales_paid_before_delivery					0.001*	–0.000	0.005***	–0.000
					(0.001)	(0.001)	(0.001)	(0.001)
subcontract							–0.118
							(0.135)
primary_supplier								0.014**
								(0.006)
Constant	–1.580***	–1.531***	–1.783***	–1.602***	–1.561***	–1.838***	–2.431***	–1.898***
	(0.129)	(0.136)	(0.145)	(0.140)	(0.138)	(0.148)	(0.210)	(0.153)
Pseudo R–squared	0.031	0.032	0.054	0.037	0.033	0.058	0.118	0.058
Observations	3100	3100	3100	3100	3100	3100	1699	3073
Log–Likelihood	–1163.8	–1163.3	–1136.7	–1157.6	–1162.0	–1132.5	–604.0	–1112.8
Chi–squared	87.6	88.2	133.3	94.4	91.0	137.6	132.0	137.5
P–value	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000

Source: Author, based on data from World Bank.

Notes: *p<0.10, **p<0.05, ***p<0.01. Dependent variable is a dummy = 1 if firm is reticent, 0 otherwise.

The three main trust variables, percentage of written orders, percentage of prepaid material orders and percentage of sales paid for before delivery, all enter significantly into the models in Table 15 and all keep their expected signs. The variables for subcontracting work and for the length of time that the primary supplier has been known also keep their expected signs but do not enter significantly into these models. Finally, the coefficient on the number of years of experience of the senior manager has its expected sign, positive, and is significant in all models. The squared term for the years of manager experience is negative and significant in all specifications. This suggests that an increased level of experience is initially associated with an increased level of reticence, but that this relationship reverses after a point.
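The point at which the relationship with experience reverses can be approximated from the linear and squared coefficients. The sketch below uses the coefficients printed for Model 9 of Table 15; the function name is hypothetical, and the result is only indicative because the squared coefficient is reported to three decimal places.

```python
def turning_point(b_linear, b_squared):
    """Value of x at which b_linear * x + b_squared * x**2 reaches its peak."""
    if b_squared >= 0:
        raise ValueError("an interior maximum needs a negative squared term")
    return -b_linear / (2.0 * b_squared)

# Model 9 of Table 15, coefficients as printed (three decimal places).
peak_years = turning_point(0.054, -0.001)

# The printed -0.001 could stand for anything between -0.0015 and -0.0005,
# so the implied turning point is known only within a wide band.
earliest = turning_point(0.054, -0.0015)
latest = turning_point(0.054, -0.0005)
```

With the printed values the turning point is about 27 years of experience, but the rounding of the squared term alone allows a range of roughly 18 to 54 years. Since the probit link is increasing in the index, the turning point of the index is also the turning point of the predicted probability.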

Table 15.

Probit Estimates for Reticence

Dependent Variable: reticence	1	2	3	4	5	6	7	8	9	10
owner_male	–0.090	–0.099	–0.101	–0.012	0.027	–0.000	–0.012	0.032	0.031	0.009
	(0.082)	(0.082)	(0.082)	(0.141)	(0.146)	(0.142)	(0.142)	(0.147)	(0.146)	(0.146)
age: ≤ 30	–0.132	–0.074	–0.061	0.055	0.040	0.028	0.076	0.036	0.050	0.031
	(0.099)	(0.101)	(0.101)	(0.163)	(0.168)	(0.166)	(0.164)	(0.170)	(0.171)	(0.170)
age: 31–45	–0.033	–0.004	–0.008	–0.029	–0.021	–0.023	–0.030	–0.019	–0.009	–0.038
	(0.070)	(0.070)	(0.070)	(0.093)	(0.094)	(0.094)	(0.093)	(0.095)	(0.095)	(0.096)
age: ≥ 55	–0.176*	–0.218**	–0.165	0.072	0.041	0.078	0.082	0.058	0.060	0.038
	(0.106)	(0.108)	(0.109)	(0.140)	(0.140)	(0.141)	(0.140)	(0.141)	(0.141)	(0.144)
secondary	0.055	0.057	0.061	0.286***	0.172**	0.260***	0.272***	0.159*	0.158*	0.162*
	(0.062)	(0.062)	(0.062)	(0.085)	(0.087)	(0.085)	(0.086)	(0.088)	(0.088)	(0.089)
tertiary	–0.212	–0.186	–0.171	–0.079	–0.166	–0.091	–0.071	–0.157	–0.144	–0.132
	(0.173)	(0.173)	(0.173)	(0.271)	(0.273)	(0.278)	(0.269)	(0.277)	(0.277)	(0.276)
manu	0.152**	0.129*	0.125*
	(0.073)	(0.073)	(0.073)
retail	0.223**	0.216**	0.211**
	(0.091)	(0.091)	(0.092)
size: medium	–0.122	–0.109	–0.110	–0.078	–0.133	–0.074	–0.039	–0.089	–0.089	–0.106
	(0.078)	(0.078)	(0.078)	(0.106)	(0.109)	(0.108)	(0.107)	(0.111)	(0.111)	(0.114)
size: large	–0.166	–0.201	–0.170
	(0.544)	(0.560)	(0.552)
south	0.254***	0.252***	0.255***	0.368***	0.311***	0.388***	0.353***	0.329***	0.327***	0.320***
	(0.060)	(0.059)	(0.059)	(0.084)	(0.086)	(0.083)	(0.084)	(0.086)	(0.086)	(0.087)
wave2	0.492***	0.470***	0.468***	0.696***	0.852***	0.724***	0.714***	0.852***	0.846***	0.823***
	(0.064)	(0.064)	(0.064)	(0.100)	(0.112)	(0.102)	(0.101)	(0.111)	(0.112)	(0.111)
mgr_experience		0.012***	0.045***	0.064***	0.057***	0.063***	0.059***	0.054***	0.054***	0.053**
		(0.004)	(0.014)	(0.020)	(0.020)	(0.019)	(0.020)	(0.020)	(0.020)	(0.020)
mgr_exp2			–0.001**	–0.001**	–0.001**	–0.001**	–0.001**	–0.001**	–0.001**	–0.001**
			(0.000)	(0.001)	(0.001)	(0.001)	(0.001)	(0.001)	(0.001)	(0.001)
competitors:1				–0.168	–0.107	–0.079	–0.180	–0.050	–0.057	–0.043
				(0.321)	(0.318)	(0.319)	(0.323)	(0.319)	(0.319)	(0.319)
competitors: 2–5				–0.673***	–0.577***	–0.545***	–0.682***	–0.504**	–0.504**	–0.516**
				(0.202)	(0.202)	(0.198)	(0.204)	(0.203)	(0.203)	(0.204)
competitors:6+				–1.031***	–0.971***	–0.898***	–1.031***	–0.875***	–0.871***	–0.887***
				(0.189)	(0.190)	(0.184)	(0.191)	(0.189)	(0.190)	(0.190)
orders_written					0.007***			0.006***	0.006***	0.006***
					(0.001)			(0.001)	(0.001)	(0.001)
mat_paid_after_delivery						0.009***		0.007***	0.007***	0.007***
						(0.002)		(0.002)	(0.002)	(0.002)
sales_paid_before_delivery							0.005***	0.004***	0.004***	0.004***
							(0.001)	(0.001)	(0.001)	(0.001)
subcontract									–0.104
									(0.137)
primary_supplier										0.002
										(0.009)
Constant	–1.580***	–1.704***	–1.909***	–1.488***	–1.810***	–1.739***	–1.672***	–2.103***	–2.103***	–2.043
	(0.129)	(0.138)	(0.158)	(0.292)	(0.302)	(0.297)	(0.295)	(0.309)	(0.309)	(0.312)
Pseudo R-squared	0.031	0.034	0.037	0.104	0.136	0.117	0.114	0.149	0.149	0.146
Observations	3100	3097	3097	1697	1697	1697	1697	1697	1696	1682
Log-Likelihood	–1163.8	–1158.2	–1155.2	–611.7	–589.5	–602.8	–604.6	–580.8	–580.5	–566.7
Chi-squared	87.6	97.1	103.3	140.5	154.9	154.4	154.3	174.8	175.7	169.1
P-value	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000

Source: Author, based on data from World Bank.

Notes: *p<0.10, **p<0.05, ***p<0.01. Dependent variable is a dummy = 1 if firm is reticent, 0 otherwise.

Using the information from the model with the highest Pseudo R-squared (Model 9), predicted values of the dependent variable are constructed. These values lie between 0 and 1. These values are transformed into predicted values for reticence using the following rule: if they lie below 0.5, they are converted to a 0; if they lie above or equal to 0.5, they are converted to a 1. Here, a 1 means that the firm is predicted to be reticent based on the explanatory variables, and a 0 means that the firm is predicted to be possibly candid based on the explanatory variables. Table 18 shows the results from this exercise. The predicted values are tabulated along with the actual values. Overall, this model is able to predict 86.4 per cent of the actual outcomes correctly. To be sure, these results only apply to the manufacturing sector (since only firms in manufacturing were asked about their perceived number of competitors); however, this provides some evidence in favour of the models presented in this study.

Table 16.

Probit Estimates for Reticence

Dependent Variable: reticence	1	2	3	4	5	6
owner_male	0.031	0.027	0.706	0.069	–0.076	–0.135
	(0.146)	(0.199)	(0.193)	(0.203)	(0.234)	(0.245)
age: ≤ 30	0.050	0.206	0.256	0.238	0.213	0.220
	(0.171)	(0.227)	(0.222)	(0.227)	(0.267)	(0.328)
age: 31–45	–0.009	0.128	0.091	0.121	0.084	0.030
	(0.095)	(0.121)	(0.117)	(0.122)	(0.136)	(0.224)
age: ≥ 55	0.060	0.190	0.090	0.220	0.193	0.003
	(0.141)	(0.187)	(0.185)	(0.188)	(0.210)	(0.370)
secondary	0.158*	0.090	0.086	0.068	0.050	0.306
	(0.088)	(0.113)	(0.110)	(0.113)	(0.130)	(0.206)
tertiary	–0.144	–0.152	–0.208	–0.113	–0.265	0.653
	(0.277)	(0.365)	(0.379)	(0.352)	(0.337)	(0.459)
size: medium	–0.089	0.122	0.123	0.097	–0.017	–0.240
	(0.111)	(0.144)	(0.139)	(0.148)	(0.162)	(0.245)
south	0.327***	0.056	0.025	0.068	0.125	0.711***
	(0.086)	(0.113)	(0.112)	(0.115)	(0.126)	(0.209)
wave2	0.846***
	(0.112)
mgr_experience	0.054***	0.059**	0.060***	0.059**	0.045	0.176**
	(0.020)	(0.023)	(0.023)	(0.024)	(0.030)	(0.086)
mgr_exp2	–0.001**	–0.001*	–0.001**	–0.001*	–0.001	–0.007**
	(0.001)	(0.001)	(0.001)	(0.001)	(0.001)	(0.003)
competitors:1	–0.057	–0.375	–0.250	–0.387	–0.508	5.142***
	(0.319)	(0.420)	(0.395)	(0.428)	(0.496)	(0.730)
competitors:2–5	–0.504**	–0.660**	–0.562*	–0.675**	–0.927***	4.078***
	(0.203)	(0.304)	(0.294)	(0.312)	(0.355)	(0.308)
competitors:6+	–0.871***	–0.891***	–0.805***	–0.904**	–1.162***	3.655***
	(0.190)	(0.289)	(0.278)	(0.297)	(0.348)	(0.265)
orders_written	0.006***	0.007***	0.007***	0.007***	0.008***	–0.001
	(0.001)	(0.002)	(0.001)	(0.002)	(0.002)	(0.002)
mat_paid_after_delivery	0.007***	0.004	0.006*	0.005	0.004	–0.001
	(0.002)	(0.003)	(0.003)	(0.003)	(0.004)	(0.005)
sales_paid_before_delivery	0.004***	0.006***	0.005**	0.006***	0.007***	0.006**
	(0.001)	(0.002)	(0.002)	(0.002)	(0.002)	(0.003)
subcontract	–0.104	–0.232	–0.185	–0.225	–0.255	–0.047
	(0.137)	(0.176)	(0.182)	(0.174)	(0.191)	(0.271)
reason_govt						–0.160
						(0.226)
Constant	–2.103***	–2.050	–1.501**	–2.024***	–1.189*	–6.919***
	(0.309)	(0.592)	(0.643)	(0.577)	(0.714)	(0.756)
Interviewer Dummies	NO	YES	NO	YES	YES	NO
P-value		0.000		0.0163	0.9037
Supervisor Dummies	NO	NO	YES	YES	YES	NO
P-value			0.000	0.4004	0.9670
Interviewer–Supervisor Dummies	NO	NO	NO	NO	YES	NO
P-value					0.9481
Pseudo R-squared	0.149	0.266	0.274	0.274	0.228	0.141
Observations	1696	927	1083	927	661	564
Log-Likelihood	–580.5	–356.9	–378.4	–352.6	–314.0	–108.0
Chi-squared	175.7	211.3	218.5	221.0	191.3	498.9
P-value	0.000	0.000	0.000	0.000	0.000	0.000

Source: Author, based on data from World Bank.

Notes: *p<0.10, **p<0.05, ***p<0.01. Dependent variable is a dummy = 1 if firm is reticent, 0 otherwise.

Table 17.

HAS-probit Estimations for Reticence

Dependent Variable: reticence	(9)	(HAS-probit)	(9-Marginal Effects)	(HAS-probit-marginal Effects)
owner_male	0.026	0.023	0.005	0.006
	(0.146)	(0.155)	(0.026)
age: ≤ 30	0.056	0.080	0.010	0.013
	(0.171)	(0.191)	(0.033)
age: 31–45	–0.001	0.005	–0.000	–0.000
	(0.095)	(0.111)	(0.017)
age: ≥ 55	0.067	0.068	0.012	0.015
	(0.140)	(0.160)	(0.027)
secondary	0.152*	0.179	0.028*	0.036
	(0.088)	(0.112)	(0.016)
tertiary	–0.164	–0.170	–0.027	–0.034
	(0.276)	(0.288)	(0.040)
size: medium	–0.081	–0.077	–0.014	–0.018
	(0.111)	(0.124)	(0.019)
south	0.321***	0.385**	0.060***	0.076**
	(0.086)	(0.166)	(0.017)
wave2	0.851***	0.937***	0.130***	0.166***
	(0.112)	(0.206)	(0.014)
mgr_experience	0.053***	0.059**	0.010***	0.013**
	(0.020)	(0.026)	(0.004)
mgr_exp2	–0.001**	–0.001*	–0.000**	–0.000*
	(0.001)	(0.001)	(0.000)
competitors: 1	–0.028	0.003	–0.005	–0.006
	(0.319)	(0.406)	(0.055)
competitors: 2–5	–0.476**	–0.534*	–0.071***	–0.090*
	(0.202)	(0.282)	(0.025)
competitors: 6+	–0.844***	–0.947***	–0.198***	–0.252***
	(0.189)	(0.317)	(0.054)
orders_written	0.006***	0.006***	0.001***	0.001***
	(0.001)	(0.002)	(0.000)
mat_paid_after_delivery	0.007***	0.008**	0.001***	0.001**
	(0.002)	(0.004)	(0.000)
sales_paid_before_delivery	0.004***	0.005**	0.001***	0.001**
	(0.001)	(0.002)	(0.000)
subcontract	–0.100	–0.114	–0.017	–0.022
	(0.137)	(0.155)	(0.022)
Constant	–2.124***	–2.074***
	(0.308)	(0.382)
alpha₀		0.000
		(Imposed)
alpha₁		0.215
		(0.336)
Pseudo R-squared	0.148		0.148
Observations	1704	1704	1704
Log-Likelihood	–582.2	–582.1	–582.2
Chi-squared	174.7	30.4	174.7
P-value	0.000	0.033	0.000

Source: Author, based on data from World Bank.

Notes: *p<0.10, **p<0.05, ***p<0.01. Dependent variable is a dummy = 1 if firm is reticent, 0 otherwise.

Table 18.

Comparing Actual Outcomes with Predicted Outcomes from Probit (Model 9) Estimations

Predicted	True Outcomes		Total
	Possibly Candid	Reticent
Possibly Candid	1448	218	1666
Reticent	13	17	30
Total	1461	235	1696

Source: Author’s own.
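The share of correct predictions can be recomputed from the counts in Table 18, along with two related figures that help put it in context. The sketch below is illustrative; the variable names are not taken from the study.

```python
# Counts from Table 18: rows are predictions, columns are actual outcomes.
pred_candid_true_candid = 1448
pred_candid_true_reticent = 218
pred_reticent_true_candid = 13
pred_reticent_true_reticent = 17

n = (pred_candid_true_candid + pred_candid_true_reticent
     + pred_reticent_true_candid + pred_reticent_true_reticent)

# Share of firms whose predicted status matches their actual status.
accuracy = (pred_candid_true_candid + pred_reticent_true_reticent) / n

# Share of actually reticent firms that the model predicts to be reticent.
sensitivity = pred_reticent_true_reticent / (
    pred_candid_true_reticent + pred_reticent_true_reticent)

# Share correct under a naive rule that labels every firm 'possibly candid'.
baseline = (pred_candid_true_candid + pred_reticent_true_candid) / n
```

The accuracy is 86.4 per cent, as stated in the text. On the same counts, 7.2 per cent of actually reticent firms are predicted to be reticent, and the naive rule would be correct for 86.1 per cent of firms.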

Controlling for Interviewer and Supervisor Effects

One potential cause of observed reticence amongst companies is the interviewer. Some firms may have chosen not to answer honestly because they did not trust the interviewer who was asking the questions. Alternatively, the interviewer may not have understood the process, which might also have affected the responses of the companies. Finally, the interviewers had supervisors who made sure that the interview was being conducted correctly, and these supervisors may have had an effect on the reticence of the companies. To examine whether the interviewers and/or their supervisors had any effect on reticence, this study uses dummies for interviewers and for supervisors, together with an interaction term for each interviewer–supervisor combination. These variables are only available for the second wave of the survey. Results of this analysis are shown in Table 16. The first column of results (Model 1) shows the coefficients from Model 9 of Table 15. When all three sets of dummies are included (Model 5), Wald tests for the equality of the coefficients on interviewers, supervisors, and interviewer–supervisor combinations, respectively, fail to reject the null of equality at the 10 per cent level. This suggests that there were no statistically significant effects of interviewers, supervisors or their combinations on the reticence of companies. However, the results in columns 2 and 4 of Table 16, where the interviewer dummies are jointly significant, suggest that interviewer effects are important; including the interaction terms might be asking too much from the data. Nevertheless, these considerations do not affect the main results of the study.

Controlling for Potential Political Connections

Although there seems to be no observable relationship between guilt and reticence, there might be a relationship between political connections and reticence. One potential scenario is that a company with very strong political connections would be less reticent because it could admit to engaging in any act and get away with it due to its links with powerful people. Alternatively, a company with strong links with the government might want to hide all acts of wrongdoing so as to protect its associates in the government. Companies in the first wave of the survey were asked to give the reason that they chose to locate in their particular state. Multiple answers were allowed. One of these was that the government gave concessions and benefits which made it more attractive to locate there. In order to test the political connections hypothesis, this study uses the response to this question as a proxy for the political connections of the company. The result of this exercise is shown in Model 6 of Table 16. There seems to be no statistically significant relationship between this measure of political connectedness and reticence.

Controlling for Misclassification Error Using the HAS-probit Model

This study now turns to the potential problem of misclassifying some of the reticent interviewees as possibly candid. Using the methodology explained under ‘Misclassification of Reticence’ and in Equation (3), estimates are derived for the coefficients on the variables which are believed to affect reticence and also for the misclassification errors, α₀ and α₁. Results are shown in Table 17, which reports estimates from the ordinary probit estimation and the misclassification-adjusted probit estimation (labelled HAS-probit), together with the marginal effects from both models.

The ordinary probit model used in Table 17 is Model 9 from Table 15. Coefficients for the HAS-probit model are similar to those for the probit model. Imposing the restriction that the probability of observing a false positive is 0, the estimate of the probability of a false negative is 0.215. This is taken to mean that possibly 21.5 per cent of the sample were misclassified into the possibly candid group when they were in fact reticent. This value is consistent with the previous result that the proportion of reticent firms must be at least 16.9 per cent. Furthermore, using this result suggests that the level of guilt required to give rise to the observed distribution for the number of Yes must be 6.0 per cent.³

As expected, the marginal effects from the HAS-probit model are larger in absolute value than those from the ordinary probit model. Results from the HAS-probit model suggest that a 10 percentage point increase in the percentage of written orders is associated with an increase of 0.01 (1 percentage point) in the probability of being reticent. The same applies to the percentage of material paid for after delivery and the percentage of sales paid for before delivery. The coefficients on these three variables have the same sign and the same sized marginal effect, and all are significant at (at least) the 5 per cent level.

The largest marginal effect from the HAS-probit model came from the dummy variable for having more than 5 close competitors. A company with more than 5 perceived close competitors is 25 percentage points less likely to be reticent compared to a company with no perceived close competitors.
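With α₀ fixed at zero, the probability of observing reticence is (1 – α₁) F (x′β), so an effect on the observed probability is (1 – α₁) times the corresponding effect on the underlying probability. The sketch below checks whether the marginal effects printed in Table 17 are consistent with that rescaling. It is an illustrative reading of the reported numbers, not a description of how the study computed them, and the variable names are hypothetical.

```python
alpha1_hat = 0.215  # estimated false-negative probability from Table 17

# Ordinary probit marginal effects as printed in Table 17.
probit_me = {
    "south": 0.060,
    "wave2": 0.130,
    "competitors: 2-5": -0.071,
    "competitors: 6+": -0.198,
}

# Rescale each effect by 1 / (1 - alpha1); this is an assumption being tested
# against the table, not the study's own procedure.
rescaled_me = {
    name: round(effect / (1.0 - alpha1_hat), 3)
    for name, effect in probit_me.items()
}
```

The rescaled values are 0.076, 0.166, –0.090 and –0.252, which match the HAS-probit marginal effects reported for these variables.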

Conclusion

The aim of this study was to investigate the factors that drive reticence amongst companies when answering RR questions. The impacts of misunderstanding, ignorance, firm outcomes, guilt, trust and the risk of detection were examined in turn. Evidence was found that both trust and the perceived risk of detection influence reticence; no evidence was found that the other variables are significantly related to reticence. This seems to suggest that companies are more willing to exchange potentially sensitive information with another agent if they have trust in that agent and if there is relatively little chance of them being identified. In the context of corruption, this suggests that people are more likely to admit to something if there is less of a chance that the information they pass on can or will be used against them. These results imply that more accurate data might be acquired from large populations (where the members of the populations are aware of the large size of the population).

Acknowledgements

The author acknowledges Richard Dickens, Alan Winters and participants at the 2012 EUDN PhD workshop for useful comments. This study was conducted whilst the author was at the Department of Economics in the University of Sussex; the author thanks the University and the Department for letting the author use their resources.

This work was carried out with support from the Economic and Social Research Council. The author thanks the team at the World Bank Enterprise Survey Unit for making the data available. The author also thanks participants of the Economics of Corruption course, University of Sussex DPhil Seminar, University of Sussex Research in Progress Seminar, Spring Meeting of Young Economists and the European Survey Research Association for their useful comments. All remaining errors are the author’s own.

References

Azfar

, & Murrell

(2009). Identifying reticent respondents: Assessing the quality of survey data on corruption and values. Economic Development and Cultural Change, 57(2), 387–411.

Bound, Brown, & Mathiowetz (2001). Measurement error in survey data. In Heckman J.J. & Leamer (Eds), Handbook of econometrics (Vol. 5). Amsterdam, The Netherlands: Elsevier Science B.V.

Clarke G.R. (2012). Do reticent managers lie during firm surveys? Available at SSRN 2028725. Retrieved from https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2028725

Clarke G.R., Friesenbichler K.S., & Wong (2015). Do indirect questions reduce lying about corruption? Evidence from a quasi-field experiment. Comparative Economic Studies, 57(1), 103–135. Retrieved 1 November 2012, from https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2571273

Clausen, Kraay, & Murrell (2010). Does respondent reticence affect the results of corruption surveys? Evidence from the World Bank enterprise survey for Nigeria. Washington, U.S.A.: World Bank.

Conchie S.M., Taylor P.J., & Charlton (2011). Trust and distrust in safety leadership: Mirror reflections? Safety Science, 49(8), 1208–1214.

Friesenbichler, Clarke, & Wong (2014). Price competition and market transparency: Evidence from a randomised response technique. Empirica, 41(1), 5–21.

Greene W.H. (1990). Econometric analysis. New York, NY: Macmillan.

Hausman J.A., Abrevaya, & Scott-Morton F.M. (1998). Misclassification of the dependent variable in a discrete-response setting. Journal of Econometrics, 87(2), 239–269.

Iarossi (2006). The power of survey design: A user’s guide for managing surveys, interpreting results, and influencing respondents. Washington, U.S.A.: World Bank Publications.

Karalashvili, Kraay, & Murrell (2015). Doing the survey two-step: The effects of reticence on estimates of corruption in two-stage survey questions (World Bank Policy Research Working Paper No. 7276). Washington, U.S.A.: World Bank.

Kraay, & Murrell (2013). Misunderestimating corruption. Review of Economics and Statistics, 98(3), 455–466.

Kundt T.C., Misch, & Nerré (2016). Re-assessing the merits of measuring tax evasion through business surveys: An application of the crosswise model. International Tax and Public Finance, 24(1), 112–133.

Lambsdorff J.G. (2012). Lecture on the Economics of Corruption: The Behavioral Limits of Dishonesty. Germany: University of Passau.

McFadden (1984). Econometric analysis of qualitative response models. In Griliches & Intriligator M.D. (Eds), Handbook of econometrics (Vol. 4). Amsterdam: North-Holland.

Sah (2007). Corruption across countries and regions: Some consequences of local osmosis. Journal of Economic Dynamics and Control, 31(8), 2573–2598.