Microdata and Economic Analysis: Potential Applications

The Importance of Analyzing Micro Behaviors for Understanding Aggregate Phenomena




Microdata are a frontier with great potential for economic analysis: they provide a rich set of information for understanding in depth the relationships between individual and aggregate behaviors.
Microdata are data collected at the level of individual observation units, such as individuals, families, or companies, and they can provide granular information on various aspects of the observed units, describing their specific behaviors and characteristics.
This methodological advancement not only improves the theoretical robustness of economic analyses but also offers practical tools for formulating more effective and targeted public policies. Microdata are thus the latest stage of a methodological evolution with deep roots.

From Microfounded Models to Microdata

For many years, economists have developed theories to explain observed macroeconomic phenomena. John Maynard Keynes, in his most famous work "The General Theory of Employment, Interest, and Money" (1936), elaborated a theory describing the relationship between income and aggregate consumption. According to this theory, aggregate consumption equals a constant plus a share of a nation's disposable income. If disposable income increases, aggregate consumption tends to increase too, but less than proportionally: starting from the simple observation that people consume only a part of any increase in income, Keynes concluded that this share is less than one. This relationship between income and consumption is a crucial component of his theory of aggregate demand. However, Keynes did not explicitly formalize the relationship between the behavior of individual agents and macro observations, considering his intuition sufficient.
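The consumption relationship described above can be written as C = a + b·Y, where a is the constant, b is the share of disposable income Y that is consumed, and 0 < b < 1. The following is a minimal sketch of that formula; the function name and the numerical values (a = 100, b = 0.8) are illustrative assumptions, not figures from Keynes or from this article.

```python
def aggregate_consumption(disposable_income, autonomous=100.0, mpc=0.8):
    """Keynesian consumption function C = a + b * Y.

    `autonomous` (a) and `mpc` (b) are hypothetical values chosen
    only for illustration.
    """
    if not 0 < mpc < 1:
        raise ValueError("the consumed share of income must lie between 0 and 1")
    return autonomous + mpc * disposable_income


base = aggregate_consumption(1000.0)
higher = aggregate_consumption(1100.0)

# Income rises by 100, but consumption rises by only a part of it (b * 100).
extra_consumption = higher - base
print(base, higher, extra_consumption)
```

With these assumed parameters, an increase of 100 in disposable income raises consumption by less than 100, which is the less-than-proportional response the theory describes.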

In the following years, this simplification was the subject of much criticism: in the 1970s, many economists concluded that macroeconomic relationships should be rigorously derived from the aggregation of the microeconomic relationships of individual agents. The debate on the "microfoundation of macroeconomics" took shape around the famous Lucas critique (1976) and continues to this day. In the last decades of the 20th century, it became a rule in academia not to consider valid any macroeconomic relationship that was not "microfounded." Every economic model had to be built on the microeconomic behavior of agents, with its macro relationships derived by aggregating micro relationships formulated on explicit, clear, and rational theoretical bases. Because microdata were lacking, hypotheses could be tested only at the macro level, so the micro theoretical hypotheses and the aggregation hypotheses were tested jointly and could not be checked separately. This process was undoubtedly complex, but it was the only possible one before the era of microdata.

Since the beginning of the 21st century, the development of computing and increasing digitalization have made ever larger sets of information available, allowing empirical verification of many economic theories developed over the previous decades.

The Issue of Statistical Confidentiality

As the availability of information for individual observation units has increased, particularly in advanced economies, the importance of protecting the anonymity and confidentiality of sensitive data and individual information has grown.
This protection responds to two crucial requirements in the collection and organization of this type of information: first, preventing the identification of the individuals or entities involved, and second, safeguarding the integrity and credibility of scientific research.
The need to maintain statistical confidentiality initially limited the use of microdata outside of entities and organizations capable of providing formal confidentiality commitments. However, this constraint risked limiting the use of microdata to a restricted number of entities, preventing broader, collective use and greater knowledge production.
To limit risks and ensure wide dissemination, various methodologies have been implemented to guarantee statistical confidentiality, allowing the maximum information to be extracted from the use of microdata.

To put this aspect in context, the following sections examine the methodologies in use, focusing on those developed to safeguard the statistical confidentiality of businesses.

Methodologies Used to Ensure Statistical Confidentiality of Businesses

The methodologies developed can be grouped into three main areas:

  • Size Aggregation;
  • Microaggregation;
  • Distributed Microaggregation.

Size Aggregation

Size aggregation is the method most commonly used by statistical offices to ensure the confidentiality of information. Records of individual businesses are aggregated by sector, region, number of employees, or other dimensions along which the data can be organized.

A particularly interesting case is the import-export data declared by businesses. Here four dimensions are considered: country of origin, country of destination, tariff code, and month of declaration. Statistical institutes aggregate the elementary data over all four dimensions simultaneously, obtaining an "aggregated" database of the trade flows from a country to specific partners, for goods classified according to a customs tariff, in a given month of the year.
Before disseminating this database, statistical institutes also verify that each cell contains at least three observations (so-called "minimum frequency cell"). When a cell contains fewer, its observations are reported under the higher-level, less detailed product code.
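The aggregation and minimum-frequency check described above can be sketched as follows. This is an illustration under stated assumptions, not the procedure of any specific statistical institute: the record layout, the function names, the 6-digit and 4-digit code lengths, and the choice to drop cells that remain below the threshold after one roll-up are all hypothetical.

```python
from collections import defaultdict

MIN_CELL_FREQUENCY = 3  # at least three observations per cell, as in the text


def aggregate(records, code_length):
    """Sum trade values per (origin, destination, code prefix, month) cell."""
    cells = defaultdict(lambda: {"value": 0.0, "count": 0})
    for origin, destination, code, month, value in records:
        cell = cells[(origin, destination, code[:code_length], month)]
        cell["value"] += value
        cell["count"] += 1
    return dict(cells)


def publishable_cells(records, detailed_length=6, higher_length=4):
    """Keep detailed cells that pass the check; roll the rest up one level."""
    detailed = aggregate(records, detailed_length)
    published = {key: cell for key, cell in detailed.items()
                 if cell["count"] >= MIN_CELL_FREQUENCY}
    # Records from sparse cells are re-aggregated under the shorter code.
    sparse = [r for r in records
              if detailed[(r[0], r[1], r[2][:detailed_length], r[3])]["count"]
              < MIN_CELL_FREQUENCY]
    for key, cell in aggregate(sparse, higher_length).items():
        # Assumption: cells still below the threshold are not published.
        if cell["count"] >= MIN_CELL_FREQUENCY:
            published[key] = cell
    return published


# Hypothetical declarations: (origin, destination, tariff code, month, value)
declarations = [
    ("IT", "DE", "870321", "2023-01", 10.0),
    ("IT", "DE", "870321", "2023-01", 20.0),
    ("IT", "DE", "870321", "2023-01", 30.0),
    ("IT", "DE", "870322", "2023-01", 5.0),
    ("IT", "DE", "870323", "2023-01", 7.0),
    ("IT", "DE", "870324", "2023-01", 8.0),
]
published = publishable_cells(declarations)
for key, cell in published.items():
    print(key, cell)
```

In this made-up example, the code with three declarations is published at full detail, while the three codes with a single declaration each are reported together under their common four-digit prefix.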


Microaggregation

With this methodology, the data of each individual observation unit are replaced with the average of the cluster to which the unit belongs.
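A minimal sketch of the idea, for a single variable, is shown below. The text does not say how clusters are formed, so the rule used here (sort the values and group them in blocks of at least k units) is an assumption made for illustration, as are the function name and the sample figures.

```python
from statistics import mean


def microaggregate(values, k=3):
    """Replace each value with the mean of its cluster of at least k units.

    Clusters are built by sorting the values and cutting them into
    consecutive blocks of size k (an assumed, simple clustering rule).
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    order = sorted(range(len(values)), key=lambda i: values[i])
    groups = [order[i:i + k] for i in range(0, len(order), k)]
    # A short final block is merged into the previous one, so that
    # no cluster contains fewer than k units.
    if len(groups) > 1 and len(groups[-1]) < k:
        tail = groups.pop()
        groups[-1].extend(tail)
    result = [None] * len(values)
    for group in groups:
        cluster_mean = mean(values[i] for i in group)
        for i in group:
            result[i] = cluster_mean
    return result


# Hypothetical turnover figures for seven firms.
turnover = [12.0, 100.0, 9.0, 95.0, 15.0, 105.0, 400.0]
masked = microaggregate(turnover, k=3)
print(masked)
```

Each firm keeps its position in the dataset, but its value can no longer be told apart from those of the other firms in the same cluster, while the overall total is preserved.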

Distributed Microaggregation

In this methodology (Bartelsman, Haltiwanger, and Scarpetta, 2004), elementary data are replaced with information on their distribution along the dimension considered. Generally, the distribution is described through summary statistics: "statistical moments" such as the mean and the standard deviation, together with the deciles of the distribution. This methodology allows the creation of synthetic datasets that replicate the statistical properties of the original data without revealing individual information. An important example is the 9th VINTAGE COMPNET DATASET developed by the Competitiveness Research Network, which provides regularly updated datasets on the competitiveness of European countries, aggregated through the distributed microaggregation methodology.
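The replacement of elementary data with distribution statistics can be sketched as follows. The cell definition (sector and year), the variable, and the sample values are hypothetical, and the code does not reproduce the actual procedure used to build the CompNet dataset; it only illustrates the general idea with Python's standard `statistics` module.

```python
from collections import defaultdict
from statistics import mean, quantiles, stdev


def distribution_summary(records):
    """Describe each (sector, year) cell by its mean, std. dev. and deciles."""
    cells = defaultdict(list)
    for sector, year, value in records:
        cells[(sector, year)].append(value)
    summary = {}
    for cell, values in cells.items():
        # The sample standard deviation and the deciles need at least two
        # observations; smaller cells are skipped in this sketch.
        if len(values) < 2:
            continue
        summary[cell] = {
            "n": len(values),
            "mean": mean(values),
            "std": stdev(values),
            "deciles": quantiles(values, n=10),  # nine cut points
        }
    return summary


# Hypothetical firm-level productivity values: (sector, year, value)
firms = [("manufacturing", 2020, float(v)) for v in range(1, 10)]
firms.append(("services", 2020, 4.0))  # a single firm: cell is skipped

summary = distribution_summary(firms)
for cell, stats in summary.items():
    print(cell, stats)
```

Only the summary statistics of each cell leave the secure environment; the firm-level values themselves are never released.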


Microdata are a fundamental resource for investigating and understanding the laws of economics, indispensable for modeling and understanding complex phenomena. Their granularity allows for the verification of not only relationships between macroeconomic variables but also those that theory hypothesizes to hold at the level of individual economic agents. However, the use of microdata requires preserving the statistical confidentiality of elementary observations. This is an essential aspect to protect the privacy of individuals and companies and to ensure legal and ethical compliance in their use. To meet this need and, at the same time, allow scholars to fully exploit the potential of microdata, specific data protection methodologies have been developed.

The knowledge and use of these methodologies are fundamental for those who want to extract the maximum possible content from the growing availability of economic data. By integrating statistical confidentiality with advanced microdata analysis techniques, researchers can obtain detailed and valuable insights. This combination of data protection and advanced analysis allows for fully exploiting the opportunities offered by microdata, significantly contributing to knowledge development and evidence-based policy formulation.