Abstract
The big data sources of National Statistical Offices (NSOs) provide a strong platform for decision-making. The household income and expenditure survey is one of the most economically important surveys, especially for assessing changes in households’ consumption patterns when the inflation rate varies. In this case, big data can help to accurately measure the consumption patterns of urban and rural households at every geographical level. This exploratory study measures the extent of inequality and disparity in household income and facilities by classifying and clustering all Iranian households. Classification and soft clustering (fuzzy clustering) techniques, which are supervised and unsupervised approaches respectively, are employed to characterize the Iranian household types from 2011 to 2021. Moreover, association rule mining techniques are employed to discover and extract consumption patterns for each cluster. The results show a significant gap in purchasing power and received energy between the lowest- and highest-income households from 2011 to 2021, and this gap is steadily widening.
This study employs supervised and unsupervised data mining techniques to characterize the Iranian household types. In this paper, at first, the clusters of Iranian urban households were characterized by soft clustering techniques. Also, the consumption patterns for each cluster were extracted by association rules techniques. Finally, the result for households with the lowest and highest income in 2011 and 2021 was analyzed to explore the differences.
Introduction
Based on the existing literature, the theoretical background on the use of big data, classification, clustering and association rules is as follows:
Data mining is the process of discovering patterns in data sets using methods from machine learning, statistics, and database systems [19]. It is an interdisciplinary subfield of computer science and statistics [10], and it is the analysis step of the “knowledge discovery in databases” (KDD) process [5]. For example, big data analysis has been used to detect households with Plug-in Electric Vehicles (PEVs) and to provide customized PEV charging incentives [20]. The pivotal role of social differentiation in consumer behavior has remained an attractive area of research [7, 11, 13, 14]. Policymakers are therefore interested in understanding consumer behavior, which can be derived from total household consumption patterns, while household consumption behavior can be analyzed in the time, user, and spatial dimensions. The economic paradigm (including demand response) and the behavioral paradigm (including intervention strategies) are the two major lines of research in household consumption behavior. Household consumption big data have the “4V” characteristics, namely volume, velocity, variety, and value, as follows:
Volume
Literally, big data means lots of data. Different types of surveys produce data in large volumes, and the household income and expenditure survey is no exception. The household income and expenditure survey (HIES) has been carried out on a sample of 18701 households in urban areas and 19261 households in rural areas. The detailed survey results comprise 234 tables in two separate publications for urban and rural areas. The HIES target population includes all private and collective settled households in urban and rural areas. A three-stage cluster sampling method with strata is used in the survey. At the first stage, the census areas are classified and selected. At the second stage, the urban and rural blocks are selected, and at the third stage the sample is selected from the household frame. In order to meet the survey’s predetermined goals, the sample size is adjusted to optimize the estimation of households’ annual income and expenditure.
Indeed, to obtain estimates more representative of the whole year, the samples are fairly distributed among the months of the year. This sampling pattern results in better data storage and overall sense.
Velocity
Velocity mainly means the speed of collection, processing, and analysis of daily household consumption data. Given the size of the sample and the number of variables in the questionnaire, the amount of microdata produced by household members has increased dramatically, which calls for fast data collection and processing.
Variety
Household consumption big data have a high degree of variety. Generally, they can be a combination of structured data (e.g., consumption data from the health sector, industry sector, etc.), semi-structured data (e.g., data exchanged between a smart consumption management platform and third-party data aggregators using XML, Web services, and surveys), and unstructured data (e.g., email or SMS notifications and consumers' interactions on social media about their consumption). In addition, the household consumption dataset also contains some inter-industry data (e.g., electric vehicle-related data from the energy sector) and outside-industry data (e.g., household expenditure on recreation and culture). The combination of these datasets significantly increases the complexity of household consumption big data applications.
Value
A household consumption dataset is meaningless unless its value is explored and mined to support managers or policymakers. A simple, informative example is deriving information about changes in purchasing power from the household consumption dataset. By identifying and filling the gaps within the community, policymakers can take advantage of the analysis of these datasets.
In light of these big data sources, three different techniques were applied in this paper: classification, soft clustering (fuzzy clustering), and association rules.
Classification
The subject of classification is concerned with the investigation of the relationships within a set of “objects” in order to establish whether or not the data can validly be summarized by a small number of classes of similar objects [6].
Soft clustering
Fuzzy c-means (FCM) clustering was developed by J.C. Dunn in 1973 [4] and improved by J.C. Bezdek in 1981 [2]. In the literature, several proximity measures (dissimilarity, similarity and distance measures) have been suggested for comparing pairs of objects with imprecise, i.e. fuzzy, information [3].
In non-fuzzy clustering (also known as hard clustering), data are divided into distinct clusters, where each data point can belong to exactly one cluster, whereas in fuzzy clustering data points can potentially belong to multiple clusters. Membership grades are assigned to each of the data points (tags). These membership grades indicate the degree to which data points belong to each cluster. Thus, points on the edge of a cluster, with lower membership grades, belong to the cluster to a lesser degree than points at the center of the cluster. For example, in Fig. 1, some points lie in the common area of the clusters (the violet points) and belong to both the blue and the red clusters simultaneously.
Soft clustering.
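As a minimal sketch of how membership grades arise, the following Python snippet applies the standard FCM membership update to one-dimensional points with fixed cluster centers. The data, the centers and the fuzzifier value m = 2 are illustrative assumptions, not values from this study.

```python
def fcm_memberships(points, centers, m=2.0):
    """Return u[i][j], the membership grade of point j in cluster i.

    Standard fuzzy c-means update with fuzzifier m (assumed to be 2 here).
    """
    power = 2.0 / (m - 1.0)
    u = [[0.0] * len(points) for _ in centers]
    for j, x in enumerate(points):
        dists = [abs(x - c) for c in centers]
        if 0.0 in dists:
            # A point lying exactly on a center belongs fully to that cluster.
            hit = dists.index(0.0)
            for i in range(len(centers)):
                u[i][j] = 1.0 if i == hit else 0.0
            continue
        for i, d_i in enumerate(dists):
            u[i][j] = 1.0 / sum((d_i / d_k) ** power for d_k in dists)
    return u


# Hypothetical data: two groups and one point halfway between the centers.
points = [1.0, 2.0, 5.0, 8.0, 9.0]
centers = [1.5, 8.5]
u = fcm_memberships(points, centers)
for j, x in enumerate(points):
    print(x, [round(u[i][j], 3) for i in range(len(centers))])
```

The point halfway between the two centers receives equal grades in both clusters, which corresponds to the points in the overlapping area of Fig. 1.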
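In the sequel, the Xie-Beni (XB) index, also called the compactness and separation validity function, is introduced. It involves the membership values and the dataset, and it serves as a cluster validity measure to strengthen the results [21]. Consider a fuzzy partition of the data set $X = \{x_1, \dots, x_n\}$ into $c$ clusters with centroids $v_1, \dots, v_c$, where $u_{ij}$ is the membership grade of data point $x_j$ in cluster $i$.
Also, for a cluster $i$, the variation is
$$\sigma_i = \sum_{j=1}^{n} u_{ij}^{2} \, \lVert x_j - v_i \rVert^{2}.$$
Then the XB index is defined as
$$XB = \frac{\sum_{i=1}^{c} \sigma_i}{n \, d_{\min}^{2}}$$
where $d_{\min} = \min_{i \neq k} \lVert v_i - v_k \rVert$ is the minimum distance between cluster centroids. A smaller XB value indicates a more compact and better separated partition.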
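The following Python sketch computes the Xie-Beni index in its standard form: the sum of squared memberships times squared distances to the centroids, divided by the number of points times the squared minimum centroid separation. The one-dimensional data and crisp memberships are illustrative assumptions.

```python
def xie_beni(points, centers, u):
    """Xie-Beni index for 1-D data; u[i][j] is the membership of point j in cluster i."""
    n = len(points)
    # Compactness: membership-weighted squared distances to the centroids.
    compactness = sum(
        (u[i][j] ** 2) * (x - v) ** 2
        for i, v in enumerate(centers)
        for j, x in enumerate(points)
    )
    # Separation: squared minimum distance between two distinct centroids.
    separation = min(
        (centers[i] - centers[k]) ** 2
        for i in range(len(centers))
        for k in range(len(centers))
        if i != k
    )
    return compactness / (n * separation)


# Hypothetical example: two tight, well separated clusters.
points = [0.0, 2.0, 10.0, 12.0]
centers = [1.0, 11.0]
u = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]]
xb = xie_beni(points, centers, u)
print(xb)
```

A small value, as in this toy case, corresponds to compact clusters whose centroids are far apart.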
Association rules
Association rule learning is a rule-based machine learning method for discovering interesting relations between variables in large databases. It is intended to identify strong rules discovered in databases using some measures of interest [15]. This rule-based approach also generates new rules as it analyzes more data. The ultimate goal, assuming a large enough dataset, is to help a machine mimic the human brain’s feature extraction and abstract association capabilities from new uncategorized data. An approach to dealing with domain knowledge in data mining with association rules has been introduced [16].
Technically, in the data mining process, association rules are created by analyzing data for frequent if/then patterns and then using the support and confidence criteria to locate the most important relationships within the data. Support is how frequently the items appear in the database, while confidence is the proportion of cases in which the if/then statement is found to be true.
An itemset is a set of items. Each item is an attribute value.
In the case of market basket analysis, an itemset would contain a set of products such as cake, Pepsi, and milk. In the case of customer demographic exploration, an itemset would contain a set of attribute values, for example a value of the Gender attribute.
Support is used to measure the popularity of an itemset. The support of an itemset {A, B} is based on the total number of transactions that contain both A and B, and is defined as follows:
$$\mathrm{Support}(\{A, B\}) = \frac{\text{number of transactions containing both } A \text{ and } B}{N}$$
where N is the total number of transactions.
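Confidence (probability) is a property of an association rule. The confidence of a rule $A \Rightarrow B$ is calculated from the supports of the itemsets involved:
$$\mathrm{Confidence}(A \Rightarrow B) = P(B \mid A) = \frac{\mathrm{Support}(\{A, B\})}{\mathrm{Support}(\{A\})}$$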
Lift is also called an interesting score (or the Importance in some literature). The lift can be used to measure itemsets and rules. The lift of an itemset is defined using the following formula:
$$\mathrm{Lift}(\{A, B\}) = \frac{P(A, B)}{P(A)\,P(B)}$$
Soft clustering.
If a rule has a lift of 1, the occurrence of the antecedent and that of the consequent are independent of each other. When two events are independent, no rule can be drawn involving those two events. If the lift is greater than 1, it indicates the degree to which the two occurrences depend on one another, which makes such rules potentially useful for predicting the consequent in future datasets. The lift value takes into account both the confidence of the rule and the overall dataset. For illustration, calculations for these cases are given in Fig. 2.
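The three measures can be illustrated with a short Python sketch on a toy set of transactions. The baskets below are invented for illustration only, and the function names are hypothetical.

```python
# Hypothetical market baskets, one set of items per transaction.
transactions = [
    {"cake", "milk"},
    {"cake", "Pepsi", "milk"},
    {"Pepsi"},
    {"milk"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item of the itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Support of antecedent and consequent together, relative to the antecedent."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


def lift(antecedent, consequent, transactions):
    """Confidence of the rule relative to the overall support of the consequent."""
    return confidence(antecedent, consequent, transactions) / support(
        consequent, transactions
    )


s = support({"cake", "milk"}, transactions)
c = confidence({"cake"}, {"milk"}, transactions)
l = lift({"cake"}, {"milk"}, transactions)
print(s, c, l)
```

In this toy data the rule cake ⇒ milk has a lift above 1, so buying cake and buying milk are positively associated rather than independent.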
The Iranian Household Income and Expenditure Survey (HIES) has been implemented in rural and urban areas since 1963 and 1968, respectively. In the beginning, it only contained questions on household expenditures, but in 1974 questions concerning household income were added to the survey’s questionnaire. The HIES aims to estimate the average income and expenditure of urban and rural households at provincial and national levels. It shows the composition and distribution patterns of household income and expenditure, household consumption patterns, and the weight of each commodity in the household consumption basket, and it helps to calculate the poverty line and to study the disparity in household income and facilities. The 2010 HIES was carried out by the Statistical Centre of Iran with a sample of 18701 households in urban areas and 19584 households in rural areas. The detailed results of the survey, comprising 234 tables in two separate publications for urban and rural areas, have been made available to interested users, planners and researchers. The HIES target population includes all private and collective settled households in urban and rural areas. A three-stage cluster sampling method with strata has been used in the survey. In the first stage, the census areas are classified and selected. In the second stage, the urban and rural blocks are selected, and in the third stage the sample households are selected. The sample size is optimized to estimate the average yearly income and expenditure of households in line with the survey goals. In order to obtain estimates that are representative of households over the whole year, the samples are evenly distributed among the months of the year.
In this paper, the clusters of Iranian urban households were first characterized by soft clustering techniques. The consumption patterns for each cluster were then extracted by association rule techniques. Finally, the results for households with the lowest and highest income in 2011 and 2021 were analysed to explore the differences.
Methodology
Governments and National Statistical Offices (NSOs) face a huge volume of administrative data on a daily basis. This is because governments must keep track of various records on residents, and as a result there are many databases for each aspect of daily life, such as population, energy, geographical resources, household income and expenditure, and the labour force. All these data contribute to big data. The proper study and analysis of these data therefore helps governments in many ways. In light of this, researchers analyse big data to discover patterns and associations in order to identify and examine expected or unexpected occurrences. Classification, soft clustering and association rules are important among the descriptive models. In this study, the main techniques applied to the data to characterize Iranian households were classification and clustering analysis. To organize, summarize, and derive insights from the data, classification and clustering analysis were carried out by forming natural groupings of a set of patterns, points, or objects, with predefined class labels for classification and without any labels for clustering [8]. There are many classification and clustering algorithms, which can be categorized in a number of ways, such as partitioning, hierarchical, density-based, grid-based and soft methods [9]. The most widely used and well-known approaches are partitioning and hierarchical methods. Partitioning methods simply divide n objects into a predetermined number of groups. K-means and K-medoids are partitioning algorithms that maximize the similarity among objects within a class/cluster while minimizing it between different classes/clusters.
The hierarchical methods create a hierarchical decomposition of the given set of data objects in two basic approaches, either agglomerative or divisive.
Classification approaches
Although the Gini coefficient in Iran decreased from 2018 to 2021, inequality has not decreased in a favourable way: the number of poor households has increased. The Gini coefficient in Iran is shown in Fig. 3. Reducing the Gini index is desirable from the point of view of narrowing the gap, but this should be achieved by making the poor richer, not by making the rich poorer.
Gini-Index.
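To illustrate why a falling Gini coefficient does not by itself mean that households are better off, the following Python sketch computes the Gini coefficient from a list of incomes using the standard rank-based formula. The two income distributions are invented for illustration and are not survey figures.

```python
def gini(incomes):
    """Gini coefficient from individual incomes (0 = perfect equality)."""
    xs = sorted(incomes)
    n = len(xs)
    total = sum(xs)
    # Rank-weighted sum over the incomes sorted in ascending order.
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


# Hypothetical two-household economy before and after a downturn:
before = [10, 100]
after = [5, 20]  # both households are poorer, yet the gap is narrower
g_before = gini(before)
g_after = gini(after)
print(round(g_before, 3), round(g_after, 3))
```

In this toy case the coefficient falls even though both households lose income, which is the situation the text above warns against.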
Purchasing power of classes.
Urban households were classified into deciles based on costs from 2011 to 2021. Figure 4 illustrates the nominal cost of some basic goods (meat, rice, oil, etc.) for different deciles. The comparison between poor and rich households is presented by expenditure deciles. The purpose of Fig. 4 is to illustrate the consumption and lifestyle changes among households by decile. Do the consumption and healthy lifestyle of poor households differ from those of rich households during this period? Inflation has a direct impact on the differences in the quantities consumed between the deciles. Fig. 4 also shows the effect of inflation on the consumption cost of households and depicts changes in the distribution of consumption between the rich and the poor expenditure deciles in terms of quantity.
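A decile classification of this kind can be sketched in a few lines of Python. The function name and the expenditure values are hypothetical, and ties and survey weights are ignored for simplicity.

```python
def assign_deciles(expenditures):
    """Return a decile label (1 = lowest, 10 = highest) for each household."""
    n = len(expenditures)
    # Household indices ordered from lowest to highest expenditure.
    order = sorted(range(n), key=lambda i: expenditures[i])
    deciles = [0] * n
    for rank, i in enumerate(order):
        deciles[i] = rank * 10 // n + 1
    return deciles


# Hypothetical yearly expenditures for 20 households (arbitrary units).
expenditures = [
    55, 12, 80, 33, 47, 95, 21, 68, 74, 39,
    15, 88, 61, 27, 50, 99, 18, 71, 44, 83,
]
deciles = assign_deciles(expenditures)
print(deciles)
```

With 20 households each decile contains two of them; the household with the lowest expenditure is placed in decile 1 and the one with the highest in decile 10.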
In non-fuzzy clustering (also known as hard clustering), data are divided into distinct clusters, where each data point can belong to exactly one cluster. In fuzzy clustering, data points can potentially belong to multiple clusters. Membership grades are assigned to each of the data points (tags). These membership grades indicate the degree to which data points belong to each cluster. The Expectation Maximization (EM) cluster-assignment method uses a probabilistic measure (rather than a strict distance measure) to determine which objects belong to which clusters. Instead of choosing a point for each dimension and computing a distance, the EM method considers a bell curve for each dimension with a mean and a standard deviation. When a point falls within the bell curve, it is assigned to a cluster with a certain probability. Because the curves for various clusters can (and do) overlap, any point can belong to multiple clusters, with an assigned probability for each. This technique is considered soft clustering because it allows clusters to overlap with indistinct edges, which permits the clustering algorithm to find non-disjoint clusters. Thus, points on the edge of a cluster, with lower membership grades, may be in the cluster to a lesser degree than points at the center of the cluster.
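The probabilistic assignment described above corresponds to the expectation step of EM for a mixture of bell curves. The Python sketch below shows it for one dimension. The two components, their weights, means and standard deviations are illustrative assumptions, and the maximization step that re-estimates them is omitted.

```python
import math


def normal_pdf(x, mean, std):
    """Density of a normal (bell curve) distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def responsibilities(x, components):
    """E-step: probability that x belongs to each (weight, mean, std) component."""
    weighted = [w * normal_pdf(x, mu, sd) for w, mu, sd in components]
    total = sum(weighted)
    return [p / total for p in weighted]


# Two hypothetical, overlapping clusters with equal weights.
components = [(0.5, 0.0, 1.0), (0.5, 4.0, 1.0)]
r_edge = responsibilities(2.0, components)    # halfway between the two means
r_center = responsibilities(0.0, components)  # at the center of the first cluster
print(r_edge, r_center)
```

A point halfway between the two means is shared equally between the clusters, while a point at the center of one bell curve is assigned to that cluster with probability close to 1.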
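In this paper, we have selected the Silhouette cluster validity index to find the optimal number of clusters, which in turn implies that the right number of clusters is 10. The silhouette width is computed as follows:
For each observation $i$, calculate the average dissimilarity $a(i)$ between $i$ and all other points of the cluster to which $i$ belongs.
For all other clusters $C$ to which $i$ does not belong, calculate the average dissimilarity $d(i, C)$ of $i$ to all observations of $C$, and let $b(i) = \min_{C} d(i, C)$.
Finally, the silhouette width of the observation $i$ is defined by
$$s(i) = \frac{b(i) - a(i)}{\max\{a(i), b(i)\}}$$
Silhouette width can be interpreted as follows:
Observations with a large $s(i)$ (almost 1) are very well clustered.
A small $s(i)$ (around 0) means that the observation lies between two clusters.
Observations with a negative $s(i)$ are probably placed in the wrong cluster.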
The silhouette measure, which unites the concepts of cluster cohesion and cluster separation, is the performance measure used for evaluating the results of the clustering algorithms.
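A minimal Python sketch of the silhouette width for one-dimensional data with hard labels is given below. Here a denotes the mean distance of an observation to its own cluster and b the smallest mean distance to any other cluster. The data, the labels and the convention of assigning width 0 to singleton clusters are assumptions made for illustration.

```python
def silhouette_widths(points, labels):
    """Silhouette width (b - a) / max(a, b) for each 1-D observation."""

    def mean_dist(x, members):
        return sum(abs(x - y) for y in members) / len(members)

    pairs = list(zip(points, labels))
    widths = []
    for i, (x, lab) in enumerate(pairs):
        own = [y for j, (y, l) in enumerate(pairs) if l == lab and j != i]
        if not own:
            widths.append(0.0)  # assumed convention for singleton clusters
            continue
        a = mean_dist(x, own)  # cohesion: mean distance within own cluster
        b = min(               # separation: nearest other cluster
            mean_dist(x, [y for y, l in pairs if l == other])
            for other in set(labels)
            if other != lab
        )
        widths.append((b - a) / max(a, b))
    return widths


points = [0.0, 1.0, 10.0, 11.0]
good = silhouette_widths(points, [0, 0, 1, 1])  # sensible grouping
bad = silhouette_widths(points, [0, 1, 1, 1])   # point 1.0 is mislabelled
print(good, bad)
```

The sensible grouping yields widths close to 1 for every observation, whereas the mislabelled observation receives a negative width.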
Moreover, the Xie-Beni index is used as a further cluster validity measure to strengthen the results, where the number of clusters is 10.
The second main technique of the study is association rule mining, which is the derivation of interesting patterns from the data. Piatetsky-Shapiro and Frawley state that patterns are interesting when they are “novel, useful, and nontrivial to compute” [15]. The most typical example of association analysis is market basket analysis, and the rules are developed based on this application [1]. Apriori and FP-growth are the main algorithms used in this setting [3]. For a market basket analysis procedure, the set of overall monthly household income and expenditure records is the best candidate for association rule mining when considered as market transaction data. These association rules give a vivid picture of the main causes of changes in consumption in Iran. Implementing the Apriori algorithm on each cluster in this step is principally intended to discover strong and interesting rules that guide researchers in setting up special investigations for the differentiated household types.
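The core of the Apriori algorithm, level-wise generation of frequent itemsets with pruning of candidates that have an infrequent subset, can be sketched in Python as follows. This is a simplified illustration on invented baskets, not the implementation used by the SQL data mining software in this study.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for all itemsets meeting min_support."""
    n = len(transactions)
    items = sorted({item for t in transactions for item in t})
    current = [frozenset([item]) for item in items]
    frequent = {}
    k = 1
    while current:
        # Count each candidate and keep those that are frequent enough.
        counts = {c: sum(c <= t for t in transactions) for c in current}
        level = {c: cnt / n for c, cnt in counts.items() if cnt / n >= min_support}
        frequent.update(level)
        # Join step: merge frequent k-itemsets into (k+1)-item candidates,
        # then prune candidates that have an infrequent k-item subset.
        k += 1
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        current = [
            c
            for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        ]
    return frequent


# Hypothetical household baskets.
baskets = [
    {"rice", "tea", "eggs"},
    {"rice", "tea"},
    {"rice", "eggs"},
    {"tea", "eggs"},
    {"rice", "tea", "eggs"},
]
frequent = apriori(baskets, min_support=0.6)
for itemset, sup in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), sup)
```

With a minimum support of 0.6, all single items and all pairs are frequent in this toy data, while the three-item basket appears in only two of the five transactions and is discarded.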
Relational database (2011).
In this study, a relational database of the Iranian Household Income and Expenditure Survey [Urban] has been developed. Using SQL software, the summary data of the 2011 HIES were defined in the form of a relational database, and Fig. 5 shows the schema of this relational database of Iranian household income and expenditure. The 15 dimensions are shown in the figure: Sum-U90 as the fact table and U90P301 to U90P314 as nested tables. The nested tables from U90P301 to U90P314 contain expenditures on food, drink and tobacco, clothing and footwear, rent and fuel, furniture, household equipment and operation, medical care and health, transport and communication, recreation, education, other consumption expenditure and so on. These are dimension [nested] tables. The Sum-U90 table contains household demographic and economic information such as age, education, gender, income, size of household, occupation status, type of dwelling, dwelling facilities, the status of property, heating system, premises, vehicles owned and so on. It is a fact table with many columns. The relationship between the fact table and each nested table is 1 to n.
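The 1-to-n relationship between the fact table and a nested table can be sketched with Python's built-in sqlite3 module. The table names follow the text (with an underscore replacing the hyphen), but the column names and the inserted values are hypothetical, since the actual schema is only given in Fig. 5.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Fact table: one row per household (columns are assumed).
    CREATE TABLE Sum_U90 (
        household_id INTEGER PRIMARY KEY,
        income REAL,
        household_size INTEGER
    );
    -- Nested table: many expenditure rows per household (columns are assumed).
    CREATE TABLE U90P301 (
        household_id INTEGER REFERENCES Sum_U90(household_id),
        item TEXT,
        expenditure REAL
    );
    """
)
conn.execute("INSERT INTO Sum_U90 VALUES (1, 58480000, 4)")
conn.executemany(
    "INSERT INTO U90P301 VALUES (?, ?, ?)",
    [(1, "rice", 120000.0), (1, "tea", 45000.0)],
)
# One household in the fact table joins to several rows in the nested table.
rows = conn.execute(
    """
    SELECT s.household_id, COUNT(p.item), SUM(p.expenditure)
    FROM Sum_U90 AS s
    JOIN U90P301 AS p ON s.household_id = p.household_id
    GROUP BY s.household_id
    """
).fetchall()
print(rows)
```

Each of the fourteen nested tables would be linked to the fact table in the same way through the household key.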
It is worth mentioning that the right number of clusters is 10. The Cluster Relationship view, shown in Fig. 6, displays each cluster as a single node. These nodes are scattered across a field and group automatically based on similarities. The resulting diagram indicates which clusters are similar or dissimilar, with lines showing the relative strength of the similarities. Hence, the strongest relationships between the clusters can be seen in Fig. 6.
Clusters relationship (2011).
Relational database (2016).
The relational database and clusters for the 2016 HIES were obtained in exactly the same way. Fig. 7 shows the schema of the relational database of Iranian household income and expenditure.
The 15 dimensions are shown in the figure: Sum-U96 as fact table and U96P301 to U96P314 as nested tables. The tables from U96P301 to U96P314 contain expenditures on food, drink and tobacco, clothing and footwear, rent and fuel, furniture, household equipment and operation, medical care and health, transport and communication, recreation, education and other consumption expenditure and so on. These are dimension [nested] tables.
The Sum-U96 table contains the household demographic and economic information such as age, education, gender, income, size of household, occupation status, type of dwelling, dwelling facilities, the status of property, heating system, premises, vehicles owned and so on. It is a fact table with many columns. The relationship between these fifteen tables is 1 to n.
Clusters relationship (2016).
The Cluster Relationship, as shown in Fig. 8, displays each cluster as a single node. The resultant view is a diagram indicating which clusters are similar or dissimilar and the relative strength of these similarities. Hence, there is the strongest relationship between these clusters in Fig. 8.
Urban households were characterized in ten clusters by means of soft clustering on variables such as cost, income, the average age of the household head, occupation, residence type, Internet access, vehicles and family size in 2011 and 2016, separately.
Poverty and welfare are two interconnected concepts in economics. The welfare of a person or a group includes health, comfort, and happiness. In other words, the utility function, which depends on the consumption of goods and services, represents the preferences of individuals for various packages of goods and services. Socioeconomic status (SES) is a combined economic and sociological measure of a person’s work experience and of an individual’s or family’s economic and social position in relation to others, based on income, education, and occupation [17]. A cluster of households with its SES and other features has its own utility. Throughout this paper, utility is identified with the set of popular itemsets given by the association rules between items. These rules have been discovered by the data mining algorithms. The utility function of the low-income cluster in the 2011 HIES is defined by
The favourite basket of households in this cluster (i.e.
Similarly, the functions
where
It is notable that each cluster has its own rules. Firstly, the clusters of the 2011 and 2016 HIES based on income variables were sorted. Secondly, the purchasing association rules between the low-income cluster of the 2011 and low-income cluster in 2016 were as follows:
The average yearly income of the lowest cluster in 2011 was 58,480,000 IRR.
Relational database (2021).
Relational database and clusters for the 2021 HIES are found in the exact same way in Fig. 9.
Clusters relationship (2021).
The Cluster Relationship, as shown in Fig. 10, displays each cluster as a single node. The resultant view is a diagram indicating which clusters are similar or dissimilar and the relative strength of these similarities. Hence, there is the strongest relationship between these clusters in Fig. 10.
By implementing the Apriori algorithm on each cluster, the association rule step was principally intended to discover strong and interesting rules.
In this paper, association rule mining was run with a confidence of 0.9 and a support that started at 1 and was decreased over iterations. The 15 strongest rules were requested from the Apriori algorithm of the SQL data mining software for the highest- and the lowest-income household clusters in 2011 and 2016, respectively. These rules, at most 15 per cluster, revealed important products and relationships as follows:
Buying 1 Kg of tomato paste implies that one purchases 1 Kg of imported tea.
Buying 1 Kg of beef and calf implies that one purchases 1 Kg of imported tea.
Buying 5 Kg of potatoes implies that one purchases 1 Kg of eggs.
Buying 5 Kg of vegetable oil and margarine implies that one purchases 0.5 Kg of imported tea.
Buying 1 Kg of lentils implies that one purchases about 0.5 Kg of imported tea.
Buying 10 Kg of imported rice implies that one purchases 0.5 Kg of imported tea.
Buying 2 Kg of sugar (sugarloaf) implies that one purchases 1 Kg of eggs.
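The rule-selection procedure described above (fixed confidence, support starting at 1 and lowered step by step until enough rules are found) can be sketched in Python as follows. For brevity the sketch only mines rules between single items, and the step size of 0.05, the function names and the baskets are assumptions made for illustration.

```python
from itertools import permutations


def pair_rules(transactions, min_support, min_confidence):
    """Rules a => b between single items that meet both thresholds."""
    n = len(transactions)
    items = sorted({i for t in transactions for i in t})
    rules = []
    for a, b in permutations(items, 2):
        both = sum(a in t and b in t for t in transactions)
        only_a = sum(a in t for t in transactions)
        if both / n >= min_support and both / only_a >= min_confidence:
            rules.append((a, b, both / n, both / only_a))
    return rules


def strongest_rules(transactions, top_n=15, min_confidence=0.9, step=0.05):
    """Lower the support threshold from 1 until top_n rules are available."""
    min_support = 1.0
    rules = pair_rules(transactions, min_support, min_confidence)
    while len(rules) < top_n and min_support - step > 0:
        min_support -= step
        rules = pair_rules(transactions, min_support, min_confidence)
    # Strongest first: by support, then by confidence.
    rules.sort(key=lambda r: (r[2], r[3]), reverse=True)
    return rules[:top_n], min_support


# Hypothetical baskets of one household cluster.
baskets = [
    {"potatoes", "eggs"},
    {"potatoes", "eggs", "tea"},
    {"potatoes", "eggs"},
    {"tea"},
]
rules, reached_support = strongest_rules(baskets, top_n=2)
print(rules, round(reached_support, 2))
```

In this toy data no rule holds at a support of 1, and the two rules linking potatoes and eggs appear once the threshold has dropped to about 0.75.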
The average yearly income of the lowest cluster in 2016 was 121,040,000 IRR. The strongest association rules [as mentioned above] of this cluster were as follows:
Buying 1 Kg of cucumber implies that one purchases 1 Kg of onion.
Buying 1 Kg of onion implies that one purchases 2 Kg of potatoes.
Buying 2 Kg of potatoes implies that one purchases 2 Kg of tomatoes.
Buying 1 Kg of cucumber implies that one purchases 1 Kg of vegetables.
Buying 1 Kg (equivalent to 45,000 IRR) of tomato paste implies that one purchases 750 grams (equivalent to 60,000 IRR) of cheese.
Buying 3 Kg of potatoes implies that one purchases 2 Kg of onion.
Buying 2 Kg of yogurt implies that one purchases 500 grams (equivalent to 40,000 IRR) of cheese.
The energy received from the favorite basket with its dependent items has been calculated as follows:
and
This means that the amount of energy received from the favorite basket was significantly reduced for low-income households from 2011 to 2016. In fact, the lowest-income cluster in the 2011 HIES, with an average yearly income of about 58,480,000 IRR, had a larger consumption basket (in terms of quantity and variety of goods and energy) than the lowest-income cluster in the 2016 HIES, with an average yearly income of about 121,040,000 IRR. The average nominal income has increased, but the favorite basket has shrunk drastically. This difference indicates that planning to reduce poverty has not been successful.
Outcome comparison of lowest-income from 2011 to 2021
Note that in October 2011, 2016 and 2021, the rial had fallen to about 18,000, 42,500 and 250,000 rials per USD in the free market, respectively.
Outcome comparison of highest-income from 2011 to 2021
It is worth mentioning that this basket of household consumption, together with its association rules, reflects purchasing power. A simple review of the data, without statistical treatment, shows that the basket of the lowest-income households was more varied and richer in energy in 2011 than in 2016. The household consumption pattern shows that purchasing power has decreased for the lowest-income households. Moreover, economists widely use the Consumer Price Index (CPI) to find the real value of money over a period of time; indeed, they use the CPI as a measure of the inflation rate. Since the CPI was 43.2, 108.1 and 350.2 in 2011, 2016 and 2021 respectively, 58,480,000 IRR in 2011 was equivalent to 146,335,370 IRR in 2016 and to 474,067,037 IRR in 2021. This in turn implies that the value of the average yearly income in 2011 was 1.21 and 1.48 times the average yearly income in 2016 and 2021, respectively. In fact, the purchasing power of the lowest-income households in 2016 and 2021 was 76% and 39% of their purchasing power in 2011.
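The CPI conversion above can be reproduced with a few lines of Python. The CPI values and incomes are those quoted in this paper; the function and variable names are illustrative.

```python
# CPI values quoted in the text.
CPI = {2011: 43.2, 2016: 108.1, 2021: 350.2}


def to_year_prices(amount, from_year, to_year, cpi=CPI):
    """Express an amount of from_year money in to_year prices."""
    return amount * cpi[to_year] / cpi[from_year]


income_2011 = 58_480_000  # lowest-income cluster, IRR
in_2016 = to_year_prices(income_2011, 2011, 2016)
in_2021 = to_year_prices(income_2011, 2011, 2021)

# Ratio of the inflation-adjusted 2011 income to the later nominal incomes.
ratio_2016 = in_2016 / 121_040_000
ratio_2021 = in_2021 / 319_783_000
print(round(in_2016), round(in_2021), round(ratio_2016, 2), round(ratio_2021, 2))
```

The same function applied to the 2011 income of the highest-income cluster (308,150,000 IRR) gives its 2021 equivalent.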
On the other hand, the highest-income household cluster had an average yearly income of 308,150,000 IRR in 2011, whereas the highest-income household clusters in 2016 and 2021 had average yearly incomes of 724,670,000 IRR and 2,013,219,000 IRR, respectively. The consumption patterns and purchasing power of these households have been characterized. Details of this comparison are as follows:
The average yearly income of the highest income cluster of 2011 was 308,150,000 IRR. The strongest association rules [as mentioned above] of this cluster are as follows:
Buying 1 Kg of imported tea implies that one purchases 0.5 Kg of butter.
One buys 0.5 Kg of butter if and only if one purchases 2 Kg of eggs.
Buying 2 Kg of cheese implies that one purchases 1 Kg of imported tea.
Buying 2 Kg of potatoes implies that one purchases 1 Kg of lentils.
The average yearly income of the highest income cluster of 2016 was 724,670,000 IRR. The strongest association rules [as mentioned above] of this cluster are as follows:
One buys 1 Kg of pinto beans if and only if one purchases 1 Kg of lentils.
Buying 1 Kg of chickpea implies that one purchases 1 Kg of lentils.
Buying 1 Kg of split pea implies that one purchases 1 Kg of lentils.
Buying 1 Kg of imported tea implies that one purchases 1 Kg of animal butter.
One buys 1 Kg of imported tea if and only if one purchases 2 Kg of eggs.
One buys 1 Kg of cheese if and only if one purchases 2 Kg of eggs.
Buying 1 Kg of lentils implies that one purchases 2 Kg of eggs.
One buys 0.5 Kg of butter if and only if one purchases 1 Kg of cheese.
Buying 0.5 Kg of animal butter implies that one purchases 2 Kg of eggs.
Buying 2 Kg of tomato paste implies that one purchases 2 Kg of eggs.
Buying 2 Kg of lettuce implies that one purchases 300 grams (equivalent to 75,000 IRR) of animal butter.
Buying 2 Kg of vegetables implies that one purchases 2 Kg of eggs.
Buying 1 Kg of green pepper and bell pepper implies that one purchases 440 grams (equivalent to 11,000 IRR) of animal butter.
A review of the data, without statistical treatment, illustrates that the basket of the highest-income households in 2016 is larger than that of the highest-income households in 2011. Actually,
and
Tables 1 and 2 show that the CPI was 43.2, 108.1 and 350.2 in 2011, 2016 and 2021, respectively. In fact, the average yearly income of 58,480,000 IRR of the lowest-income cluster in 2011 is equivalent to 474,067,037 IRR in 2021, which is higher than the actual average yearly income of the lowest-income cluster in 2021, namely 319,783,000 IRR. Moreover, the average yearly income of 308,150,000 IRR of the highest-income cluster in 2011 is equivalent to 2,498,012,269 IRR in 2021, which is also higher than the actual average yearly income of the highest-income cluster in 2021, namely 2,013,219,000 IRR, but this difference for the highest-income cluster is smaller than the difference for the lowest-income cluster from 2011 to 2021. From another perspective, the purchasing power of the highest-income households in 2021 is
It is noteworthy that the amount of energy received from the favorite basket increases in the highest-income household clusters and decreases correspondingly in the lowest-income ones.
Data mining algorithms provide an important solution for extracting association rules. In this paper, the hidden pattern of consumption has been characterized by using data mining algorithms. It is possible to assign a basket to each household, which reflects the purchasing power of the household. From an economic point of view, for the lowest [highest]-income households, purchasing power declined 81 [39] percent, whereas from the data mining point of view, the association rules specify the basket and the amount of received energy. During the ten years from 2011 to 2021, the amount of energy received by the lowest-income households decreased and reached 11% of the initial value, but for the highest-income households the evaluation showed that it increased and reached 2.43% of the initial value. This means that poor households were highly affected, while wealthy households were not only resilient but also able to adapt and even improve their conditions. This paper also revealed that the diet of poor households was much more fragile than that of wealthy households. This means that (Iranian) politicians and policymakers have not developed appropriate and relevant policies and could not fill the gap between the poor and the rich.
Abbreviations
SQL: Structured Query Language.
HIES: Household Income and Expenditure Survey.
NSO: National Statistical Office.
DM: Data Mining.
FCM: Fuzzy c-means.
KDD: Knowledge Discovery in Databases.
Competing interests
The authors declare that they have no competing interests.
Authors’ contributions
All mentioned authors contributed to the elaboration of the paper. All authors read and approved the final manuscript.
Availability of data and materials
The microdata for household income and expenditure survey known in Iran and referred in this paper is available online at the government website:
Ethics approval and consent to participate
Not applicable.
Funding
Not applicable.
Acknowledgments
The authors thank the anonymous reviewers for their helpful suggestions and comments.
