Policy Research Working Paper 11032

Estimating the Number of Firms in Africa

Marcio Cruz
Florian Moelders
Edgar Salgado
Ariane Volk

International Finance Corporation
January 2025

Abstract

This paper estimates the number of firms in Africa, considering their size and formal status. It relies on a novel methodology that combines multiple data sources. The results suggest that by 2020 there were 12.7 million firms with more than one worker, and about 240 million own-account businesses, where the proprietor constituted the sole employee. Informality prevails among micro businesses, totaling 218 million own-account informal businesses and 9 million additional micro-businesses with fewer than five workers. Among a total of 2.3 million firms with five or more workers, 2 million are formal firms. The proposed methodology provides valuable insights to researchers and policymakers by enabling an assessment of the potential market size based on firm characteristics in a context of limited information.

This paper is a product of the International Finance Corporation. It is part of a larger effort by the World Bank Group to provide open access to its research and make a contribution to development policy discussions around the world. Policy Research Working Papers are also posted on the Web at http://www.worldbank.org/prwp. The authors may be contacted at marciocruz@ifc.org.

The Policy Research Working Paper Series disseminates the findings of work in progress to encourage the exchange of ideas about development issues. An objective of the series is to get the findings out quickly, even if the presentations are less than fully polished. The papers carry the names of the authors and should be cited accordingly. The findings, interpretations, and conclusions expressed in this paper are entirely those of the authors.
They do not necessarily represent the views of the International Bank for Reconstruction and Development/World Bank and its affiliated organizations, or those of the Executive Directors of the World Bank or the governments they represent.

Produced by the Research Support Team

Estimating the Number of Firms in Africa*

Marcio Cruz† Florian Moelders‡ Edgar Salgado§ Ariane Volk¶

JEL Classification: D22, L25, L26
Keywords: Firms, Entrepreneurship, Africa, Private Sector Development, Market Analysis and Segmentation

* The authors thank Paolo Mauro, Denis Medvedev, Roberto Fattal, Xavier Cirera, Mark Dutz, Justice Mensah, and participants of the authors’ workshop “Digital Opportunities in African Businesses” for valuable comments. Lucio Castro actively participated in earlier drafts of this paper and we are indebted to him. Lucien Ahouangbe, Kaleb Abreha, and Ghita Chraibi provided superb research assistance. This paper is a product of the IFC Economic Research Unit. Financial support from the Government of Japan is gratefully acknowledged.
† International Finance Corporation, email: marciocruz@ifc.org (Corresponding author)
‡ International Finance Corporation, email: fmoelders@ifc.org
§ International Finance Corporation, email: esalgadochavez@ifc.org
¶ International Finance Corporation, email: avolk@ifc.org

1 Introduction

The number of firms operating in Africa and their characteristics remain challenging to ascertain due to the limited and inconsistent availability of establishment censuses and business statistics across the continent. This paper aims to bridge this knowledge gap by mapping existing sources of firm-level data, integrating employer information from household and labor surveys, and applying a simple statistical model to estimate the total number of firms in African countries, categorized by size, sector, and formal status.
The scarcity of data on the number of firms in Africa starkly contrasts with the extensive, up-to-date, and comparable firm-level information available in advanced economies. For instance, the OECD’s Structural and Demographic Business Statistics and the European Union’s Business Demography data exemplify the comprehensive nature of firm demographic information available for these regions. Previous studies have emphasized the importance of these estimates for Africa (Teal, 2023b,a; Tsaedu et al., 2023), but often focus on a few particular countries for which firm data are available.

Firm demographic indicators are crucial for policymakers as they provide insights into the structure and dynamics of the economy, enabling targeted and effective decision-making. These indicators help identify which sectors and firm sizes drive growth and highlight regions or population groups facing challenges, such as limited access to technology needed to boost productivity. They are also instrumental in making decisions related to taxation, social security, and policies aimed at fostering job creation. Additionally, firm demographics are vital for assessing economic vulnerabilities, designing tailored crisis responses, and evaluating the long-term impact of policies. A recent report by The Economist (2025) suggests that “Africa has too many businesses, too little business,” emphasizing the lack of large enterprises on the continent as a key constraint to economic growth. Better estimates of the distribution of firms by size and formal status, along with the distribution of workers associated with them, can provide greater clarity on the magnitude of this gap.

Estimates on business demographics are also relevant for investors and the private sector. Combined with population demographic indicators, which are already widely used for these purposes, this information can be used to estimate the potential markets for intermediate inputs and the demand for finance.
One application is provided by Cruz, Salgado, and Tran (2024), who combine our estimated number of firms by various characteristics across African countries to analyze the potential for digital upgrades by businesses across the continent. Another is the MSME Finance Gap Report (IFC, 2017), which uses estimates of the number of firms to analyze the gap and potential demand for finance by micro, small, and medium enterprises.

This paper estimates the total number of firms in all African countries, distinguished by size and informality status. It contributes to the broad research agenda on firms in developing countries by producing a dataset with the basic demography of African businesses. The dataset is the result of applying statistical methods to information from national censuses and surveys as well as data from the International Labor Organization (ILO) and the World Bank Enterprise Survey. First, we estimate the number of African firms at the aggregate and country levels based on employers. Second, we differentiate between formal and informal firms. Lastly, we estimate the size distribution for both formal and informal firms. We then validate our results by comparing our estimates with data from a few countries where establishment censuses are available. Our methodology and estimates aim to address the limited information available on firm demographics in African countries. The results should be interpreted as a first attempt at these estimates, to be improved over time by adding more harmonized establishment censuses and representative surveys that provide a more accurate and richer description of firms on the continent.

Our results suggest that as of 2020, there were about 12.7 million firms with more than one worker, and about 240 million own-account businesses in which the proprietor constituted the sole employee.
Eighty percent of micro businesses are informal, totaling 187 million own-account businesses, and 7.7 million out of 9.4 million micro firms with fewer than five workers, excluding own-account. Among 3.2 million firms with five or more workers, 2.3 million are formal. Distinguishing by formality status and size is instrumental in identifying firms with growth potential, as opposed to those driven by necessity.

Our analysis is closely related to various studies emphasizing the relevance of firm characteristics to the economy. One important effort worldwide is the dataset assembled by Bento and Restuccia (2021). This dataset captures the average employment size of establishments in the manufacturing and service sectors, regardless of their formality status. Their findings suggest that manufacturing firms tend to be larger than those in the services sector, and that establishments in each sector are larger in more advanced economies. They provide estimates of employment size for 42 of the 54 African countries. Another relevant study (Eslava et al., 2024) combines information from harmonized household and labor surveys with firm-level data to analyze the relationship between firms and inequality in Latin America. Informality and self-employment are widespread across Latin America, but even more prevalent in Africa. They are important in driving income inequality and slowing down technology diffusion (Levy, 2024).

This paper also contributes to the broad literature on informality in Africa. Informality varies in several dimensions (e.g., a firm can be formal in some respects and informal in others) and is sparsely captured by the few national firm censuses and surveys in the region (Bonnet et al., 2019; Choi et al., 2020).
Previous studies have identified key characteristics of informal firms (Loayza, 2016; Porta and Shleifer, 2008), but the literature primarily assesses informality in terms of its contribution to GDP or employment, neglecting the prevalence and significance of informality in micro and small firms (Ohnsorge and Yu, 2022; Quiros-Romero et al., 2021). Informal businesses are also associated with “the missing middle hypothesis” in Africa. Abreha et al. (2022) provide evidence of a missing middle in the size distribution of manufacturing plants in Burkina Faso, Cameroon, Ghana, and Rwanda, resulting from the underrepresentation of informal firms. We expand the estimates of the number of informal firms by size group across the continent.

The rest of the paper is organized as follows. Section 2 provides an overview of the data and its limitations. Section 3 describes the methodology used to estimate the number of African firms and their characteristics. Section 4 presents the main results and some robustness checks. The final section outlines the main conclusions and further research directions.

2 Data and the definition of a firm

The number of firms can be directly determined using business registries, censuses, and surveys. This often requires that a firm have a fixed physical location. Although this is generally regarded as the gold standard for data on firms, such data are rarely available for developing countries.

Firms can also be assessed indirectly via information on employment. Following Coase (1937), a firm can be understood as the space where a relational contract occurs. Specifically, it is where someone, an employer, hires workers to perform a number of tasks. Under this characterization, employers, regularly reported in labor and household surveys, can be taken as a good approximation of firms.

To put this definition into context, it is worth exploring other definitions.
OECD defines a firm as “[...] a legal entity possessing the right to conduct business on its own, for example, to enter into contracts, own property, incur liabilities and establish bank accounts.”1 The World Bank Enterprise Survey (WBES) defines it as the aggregation of establishments, where the establishment is “a physical location where business is carried out and where industrial operations take place or services are provided. A firm may be composed of one or more establishments.”2 The World Bank’s Entrepreneurship Database (WBED) provides measures on registered firms across countries, which are defined as formally registered private companies with limited liability.

Formal status is an important dimension for the definition of firms in developing countries. OECD’s and WBED’s definitions imply formality, while the WBES’s definition does not, although most of the WBES data focus on formal firms with 5 or more workers. If we aim to understand the broader universe of firms in the African context, we also need to incorporate informal businesses, given that they represent a large share of the economy. Yet very limited data are available across African countries providing comprehensive measures of the formal and informal businesses that remain active.3 An alternative way to address the data limitations on firms is to adopt the employer perspective. This approach allows the use of micro-level datasets (e.g., labor and household surveys) that are more widely available across Africa.

ILO defines employers as “those workers who, working on their own account or with one or a few partners, hold the type of jobs defined as a ‘self-employment jobs’ (i.e. jobs where the remuneration is directly dependent upon the profits derived from the goods and services produced), and, in this capacity, have engaged, on a continuous basis, one or more persons to work for them as employee(s)”.
In labor terms, there is another group of self-employed workers conducting activities, the own-account workers. The main distinction between this group and the group of employers is the hiring of labor. While both categories represent individuals managing a business, only the employer employs additional labor. In contrast, own-account workers operate without hiring additional workers.

Therefore, our primary emphasis lies on the employers category. While we also provide results for own-account individuals engaged in business, our criteria for classifying an entity as a ‘firm’ require that the individual not only runs the business but also employs additional labor, thereby acting as an employer.

1 Source: OECD Data.
2 See for example “Enterprise Survey 2009-2017, Panel Data Liberia, 2009 - 2017”.
3 WBED has recently incorporated information on the total number of registered and closed businesses, in addition to the number of new registrations. Comparison with external datasets suggests that business registry sources tend to provide robust estimations of the number of newly registered businesses, but they have more limitations when referring to the total number of active businesses.

2.1 Employers, own-account, and employees

ILO offers exhaustive employment and employee statistics, based on nationally representative labor force surveys combined with household surveys and population censuses.4 ILO data include the number of employers, employees, and own-account workers by country and year. Specifically, employees who have an explicit or implicit contract of employment with an employer are distinguished from the self-employed, for whom remuneration is directly tied to the profits from goods and services that they produce. The self-employed include employers, who continuously have engaged one or more persons to work for them in their business, and own-account workers, who do not employ other workers.
The number of employers, own-account workers, and employees by country constitute the key statistics from which we estimate the number of firms by size and formality status. ILO reports two versions of these variables. First, ILOSTATS is a data repository with information from labor force and household surveys worldwide. The reporting of employment information is harmonized to allow country comparison; however, data gaps remain. Not all countries produce labor or household surveys regularly. This creates space for the second version of employment data, stored in a second repository, ILOEST (ILO, 2023). ILOEST produces estimates of labor force variables using a series of cross-validated statistical models selected based on pseudo-out-of-sample comparisons. In some cases, models are country idiosyncratic.

According to ILO (2023), the initial modeling predicts the country-level labor force participation rate using economic and demographic variables across nine estimation groups split based on economic and geographical proximity. Relevant to our setting, ILO further models the distribution of employment by status. In a similar procedure, first, the shares of each category (employer, employee, own account) are modeled against a set of country characteristics such as per capita income, economic structure, and model-specific variables such as the “work for an employer” index from the Gallup World Poll. Second, the models estimate the evolution of these employment category shares based on the economic cycle and population demographics. Applying these shares to the estimated number of people employed gives the estimated number of individuals engaged in each employment category, which is the main input we use.
One limitation of these data for estimating the number of businesses by country is that they do not provide information on firm characteristics, such as size distribution, which is available in firm-level establishment censuses and representative surveys.

4 See table A1 in the Appendix for the underlying data sources used by ILO.

2.2 Establishment censuses, economic censuses, and surveys

Economic censuses are statistical surveys conducted on the full set of economic units belonging to a given population. In the US, the Economic Census serves as the statistical benchmark for current economic activity, feeding into the calculation of GDP and the Producer Price Index (PPI). The Census provides information on employment, payroll, revenue, sales, and industry classification. Economic censuses are also commonly used to update business registries.

Establishment census data collected by national registries, statistical offices, and ministries are the gold standard for information on establishments and firms and serve as a sample frame for firm surveys. They provide information on the number of firms by size, age, sector, and often, formality status. Most African countries carry out such censuses with considerable time gaps between rounds, because of the time and effort required to survey the entire universe of establishments. While some form of economic census exists for Botswana, Burkina Faso, Cameroon, the Democratic Republic of Congo, the Arab Republic of Egypt, The Gambia, Ghana, Mauritius, Namibia, Rwanda, Sierra Leone, Tunisia, Senegal, and Uganda, it should be noted that access to most of this census information is constrained or aggregated in statistical reports.5 Comprehensive micro-data was available only for Kenya, Burkina Faso, Cameroon, Rwanda, and Ghana.

Although census data are available for some countries, the coverage of economic censuses or firm registries varies considerably, compromising harmonization of the collected data across countries.
For example, among the available micro-datasets, Cameroon’s registry excludes the taxi industry, while Ghana’s registry includes modern agriculture, even though it states that the agriculture sector is not covered. In general, most of these registries and censuses focus on the non-agriculture sector, but to varying degrees.

We extract information about firm characteristics from World Bank Enterprise Surveys (WBES). WBES are firm-level surveys aiming to provide data that are comparable across countries for a representative sample of firms with 5 or more workers. The surveys cover a broad range of business environment topics, including access to finance, corruption, infrastructure, competition, and performance measures. WBES provides a comprehensive cross-country database for 194,000 manufacturing and services firms in 155 countries. The surveys, however, only include establishments with five or more workers, and therefore exclude micro-enterprises and, in most countries, informal firms. This focus on small to large formal firms might lead to a potential bias in the representation of firm demographics in Africa (Li and Rama, 2015).

Another source of data on firms is the World Bank Entrepreneurship Database (WBED), which was not used in this analysis. WBED provides information on the number of private formal companies with limited liability for developing and developed economies on a yearly basis. These data would, hence, form a lower threshold for the number of firms in these countries. The database also provides information on firm entry and exit for selected economies derived from national business registries. A comparison between WBED and OECD’s Structural and Demographic Business Statistics database,6 where available for both, reveals that firm exit may not be accurately captured because it requires firms to file the legal closing of the establishment with government entities to be included in the number of exiting firms. This might cause an upward bias in the number of formal firms, the extent of which is unclear. WBED also lacks information about informal or unregistered establishments. For these reasons, we exclude the WBED from this analysis.

5 Aggregate numbers also prevent breaking down by size category because of idiosyncratic thresholds used in every context. Harmonization was attempted but the proposed analysis ultimately required micro data, restricting further the number of countries with census micro data at our disposal.

We collect information on firm demography from censuses and surveys for more than 25 African countries from all regions and income levels between 2010 and 2020. We extract information on firm size based on the number of employees, which determines the distinction between micro, small, medium, and large firms. Further, where available, we add information on firms’ formality status (whether a firm is registered or at least licensed in the country or province) as well as sector.

We also identify where national censuses and surveys do not cover key sectors (e.g. agriculture), or explicitly exclude informal or household-based firms. We discard outdated censuses and surveys collected before 2010 as well as censuses and surveys that lack information about how data was collected. We restrict ourselves to censuses and surveys up to the year 2020, as evidence shows that the COVID-19 pandemic caused much volatility in firm entry and exit, and potentially large fluctuations in the number of firms (Decker and Haltiwanger, 2022). Evidence from rapid surveys suggests that up to 90 percent of firms in selected Sub-Saharan African countries closed temporarily (Aga and Maemir, 2021). Many firms had to lay off workers in the short term only, with consequences for a firm’s employee count.
While recent surveys and censuses may give a more up-to-date assessment of the number and characteristics of firms, they may also introduce comparability issues with censuses and surveys collected pre-pandemic.

The collected data vary across countries in the definitions of size strata, and in the coverage of large and informal firms and sectors. As mentioned above, this compromises the harmonization of the datasets. We find meaningful differences, in particular, in how countries define a micro-enterprise. For instance, only entrepreneurs with no employees are considered micro-enterprises in The Gambia, but in three countries, establishments with up to ten employees are identified as micro-enterprises. Small firms are defined as anywhere from 1-4 employees to 10-100 employees (Morocco). Few countries follow the standardized WBES definitions of firm size: 5 to 19 (small), 20 to 99 (medium), and 100 or more (large). In the absence of available micro-data for censuses and surveys, this heterogeneity, unfortunately, cannot be resolved.

Further, the definition of informality (if included in the documentation) differs widely across countries: Some countries inherently include activities of households as employers as informal activities; others classify any micro-enterprise as informal, independent of firm registration. Some countries draw the line between formal and informal firms according to firm registration, firm licensing, or both. Other surveys and censuses explicitly exclude businesses without fixed premises from the stock of businesses, and others classify itinerant traders, taxis, or market stands as informal solely based on the lack of fixed premises. Where possible, we avoid relying on countries’ definition of formality, but instead extract information on the number of registered firms, and the number of licensed firms as a second-best measure of formal firms.
6 Source: https://www.oecd.org/sdd/business-stats/structuralanddemographicbusinessstatisticssdbsoecd.htm

3 Methodology

This section describes a methodology to calculate the number of firms by country and the subsequent distinction by size group and formality status. Because harmonization across establishment censuses or registries is not currently feasible, we adopt an approach based on ILO data that applies the same definition of a firm, from the perspective of the employer, across the whole economy and across countries. We then combine this information with the moments of the distribution of firms across sizes from other data sources.

3.1 Estimating the total number of firms

The simplest calculation of the total number of firms is the sum of people reporting to be employers. This is facilitated by the availability of data on employers, employees, and own-account workers provided by ILO (2023). The latest available data point is for 2020 and indicates that in the whole African continent, there are about 12.7 million employers and 238 million own-account individuals. Geographically, this follows population patterns (see figure 1.a). Regarding density, Northern and Southern Africa are the regions with lower firm density, while East African countries have higher density (see figure 1.b).

Figure 1: Number of Firms and Firms per 1,000 working-age people in Africa in 2020
Source: Own calculations using ILO modelled data on the number of employers. These maps were produced by the Cartography Unit of the World Bank Group. The boundaries, colors, denominations and any other information shown on this map do not imply, on the part of the World Bank Group, any judgment on the legal status of any territory, or any endorsement or acceptance of such boundaries.
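The calculation in Section 3.1 can be illustrated with a minimal sketch, assuming that employers are used directly as the firm count and that density is expressed per 1,000 working-age people, as in figure 1. The class name, function name, and all figures below are hypothetical and are not taken from the ILO data.

```python
from dataclasses import dataclass


@dataclass
class CountryLabor:
    """Labor-market counts for one country-year (all values hypothetical)."""
    name: str
    employers: float        # people reporting to be employers, used as the firm count
    own_account: float      # own-account workers, reported separately from firms
    working_age_pop: float  # working-age population


def firms_per_1000(c: CountryLabor) -> float:
    """Firm density: employers per 1,000 working-age people."""
    return 1000.0 * c.employers / c.working_age_pop


# Made-up numbers for two fictional countries, for illustration only.
countries = [
    CountryLabor("Country A", employers=200_000, own_account=4_000_000,
                 working_age_pop=10_000_000),
    CountryLabor("Country B", employers=90_000, own_account=1_500_000,
                 working_age_pop=3_000_000),
]

# The aggregate number of firms is the sum of employers across countries.
total_firms = sum(c.employers for c in countries)
density = {c.name: firms_per_1000(c) for c in countries}
print(total_firms, density)
```

In this sketch own-account workers are carried along but never added to the firm count, which mirrors the distinction the paper draws between employers and own-account individuals.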
3.2 Breakdown by size

To break down the number of firms by employment size we develop a method that leverages complementary information from the WBES and census microdata from Ghana, Burkina Faso, Cameroon, and Rwanda. We also use a representative survey for Kenya. These data sources are combined in a framework that links the proportion of different firm sizes to key moments of the firm size distribution.

Using the five-worker threshold for micro firms, as per the WBES size classification, figure 2 suggests that the proportion of micro firms varies depending on the shape of the distribution. Cameroon has a higher share of micro firms (88%) than Ghana (80%) but a lower share than Rwanda (93%). It is natural, then, that Ghana has a larger share of small firms (those employing between 5 and 20 workers), which is reflected in the larger grey shaded area between the reference points 5 and 20 in figure 2. In general, a firm size distribution looking more like Ghana’s will yield higher proportions of small firms than a firm size distribution that looks more like Rwanda’s.

[Figure 2 panels: Burkina Faso (2016), Cameroon (2009), Ghana (2014), Rwanda (2014), Rwanda (2017), Rwanda (2020). x-axis: average firm size, with reference points at 5, 20, and 100; y-axis: fraction of firms. Series: all firms, informal firms.]

Figure 2: Firm Size Distribution and Size Brackets: from establishment census data
Source: Own calculations based on census data from Burkina Faso (2016), Cameroon (2009), Ghana (2014) and Rwanda (2014, 2017 and 2020). The x-axis reports log firm size to ease visibility. Informality definition according to each country’s criteria. Red lines are size thresholds of 5, 20, and 100 workers.
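To make the size brackets concrete, the following minimal sketch computes the four size shares from a list of firm sizes using the thresholds of 5, 20, and 100 workers. It assumes that a firm exactly at a threshold falls in the upper bracket, in line with the WBES classes of 5 to 19, 20 to 99, and 100 or more; the function name and the sample sizes are hypothetical.

```python
from bisect import bisect_right

THRESHOLDS = (5, 20, 100)                      # lower bounds of small, medium, large
LABELS = ("micro", "small", "medium", "large")


def size_shares(sizes):
    """Share of firms in each size bracket (fewer than 5, 5-19, 20-99, 100+)."""
    counts = [0] * len(LABELS)
    for s in sizes:
        # bisect_right counts how many thresholds are <= s,
        # which is exactly the index of the bracket the firm belongs to.
        counts[bisect_right(THRESHOLDS, s)] += 1
    n = len(sizes)
    return {label: c / n for label, c in zip(LABELS, counts)}


# Hypothetical firm sizes (number of workers), for illustration only.
example_sizes = [1, 2, 3, 4, 5, 19, 20, 99, 100, 250]
shares = size_shares(example_sizes)
print(shares)
```

Applying the same function to every country's microdata is what makes the resulting shares comparable, regardless of each country's own size definitions.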
Since data on the whole firm size distribution is scarce, our strategy first defines a statistical model that relates each size share to key moments of the firm size distribution. The idea is to obtain an empirical basis for predicting the share of micro, small, medium and large firms from just a few moments of the firm size distribution, notably reducing the data requirements.

The strategy also helps with harmonization of the resulting dataset. It is not uncommon for countries to have their own size brackets, corresponding to idiosyncratic interests. While that could be relevant for the country itself, it prevents comparability. Our approach defines one set of thresholds that is applicable to all countries. To further leverage information from the WBES, we define the same size thresholds at 5, 20 and 100 employees. This also follows the suggestion in Li and Rama (2015) that correcting the biases arising from the exclusion or under-representation of micro- and small firms in enterprise surveys is necessary, using household and labor force surveys to re-calibrate the firm size distribution.

As stated above, the first goal is to map the relationship between the shape of the distribution and each size proportion through a few key moments. Ideally, we would observe several firm size distributions of different shapes to inform a statistical model that connects size shares and the shape of the firm size distribution. However, we have only 5 countries with microdata that allow this: Burkina Faso (2016), Cameroon (2009), Ghana (2014), Rwanda (2014, 2017, and 2020), and Kenya (2017).7 To increase the number of observed distributions from which we can derive key moments and firm size shares, we break each country’s microdata into firm-age groups from 0 to 61, creating more distributions with different shapes.
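The splitting step can be sketched as follows, assuming the microdata are available as (firm age, number of workers) pairs. The moments computed here are the mean and the 25th, 50th, 75th, and 90th percentiles; the function names, the percentile interpolation method, and the rule of skipping age groups with fewer than two firms are assumptions made for illustration, not details taken from the paper.

```python
from collections import defaultdict
from statistics import mean, quantiles


def distribution_moments(sizes):
    """Mean, p25, p50, p75 and p90 of a list of firm sizes."""
    # quantiles(..., n=100) returns 99 cut points; cuts[k - 1] is the k-th percentile.
    cuts = quantiles(sizes, n=100, method="inclusive")
    return {"mean": mean(sizes), "p25": cuts[24], "p50": cuts[49],
            "p75": cuts[74], "p90": cuts[89]}


def moments_by_age(firms, min_firms=2):
    """Group (age, size) records by firm age and summarize each group.

    Groups with fewer than `min_firms` firms are skipped (an assumption of
    this sketch), since percentiles are not defined for a single firm.
    """
    groups = defaultdict(list)
    for age, size in firms:
        groups[age].append(size)
    return {age: distribution_moments(sizes)
            for age, sizes in sorted(groups.items())
            if len(sizes) >= min_firms}


# Hypothetical microdata: (firm age in years, number of workers).
example_firms = [(0, 1), (0, 2), (0, 3), (0, 4), (0, 5),
                 (1, 10), (1, 30), (2, 7)]
by_age = moments_by_age(example_firms)
print(by_age)
```

Each age group then contributes one observation, its moments together with its size shares, to the statistical model.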
This leaves us with 62 firm size distributions by country, totaling 434 distributions from which we can extract size shares and information on the shape of the distribution. We compute the four size proportions from each firm size distribution and the mean, p25, p50, p75, and p90. We have 434 observations to elucidate their relationship to the firms’ size shape accounted for by these four moments. In practice, this implies estimating8 the following regression for each of the four size proportions: zki ln ¯i + β 2 p25i + β 3 p50i + β 4 p75i + β 5 p90i + ε i = β0 + β1 e (1) (1 − zki ) where zki is the proportion k = {micro, small , medium, large} coming from the firm size distri- bution of all firms aged i = {0, ..., 61}. The algorithm uses the estimated βs to predict all remaining countries based on the four moments we reconstruct in the following step. It is important to estimate equation 1 using the information of several distributions of different shapes so the βs carry the precise connection between the moments and the size shares. Figure 3 shows the correlation between the actual size shares and the predicted ones in every age group. The model produces accurate estimates of the size shares based on these five moments, mean, p25, p50, p75, and p90. The next step consists of obtaining the five moments for every country so we can use the βs from equation 1 to predict the size shares. For this step, we rely on ILO and WBES data. With 7 Results of this section, and the overall estimations are robust to excluding Kenya survey data and using only census data. Results available upon request. 8 STATA implements the maximum likelihood estimator developed by Papke and Wooldridge (1996). 11 Figure 3: Real and Predicted Size Shares Source: Own calculations based on projections from equation 1 and establishment census data from Burkina Faso (2015), Cameroon (2008), Ghana (2013) and Rwanda (2014, 2017 and 2020); and survey data from Kenya (2017). 
ILO data, it is possible to recover at least the nationwide average firm size for every country because this dataset provides information on the number of employers, own-account workers, and employees. As our framework equates employers with firms, the average firm size can be approximated by combining the number of employers with the number of people working as employees, both reported by ILO. The average firm size by country is then $\bar{e} = \text{employees} / \text{employers}$.

Using Ghana as an example, we can illustrate the strategy. The average Ghanaian firm size estimated for 2013 is 4.3 workers per firm (i.e., the ratio of employees over employers), while the establishment census of the same year reports an average firm size of 5.2. The WBES surveyed Ghana in 2007 and 2013, and the pooled average firm size in this dataset is 33.3 workers per firm, roughly six to eight times the average obtained from either ILO data or the establishment census. The WBES dataset provides the shape of the firm size distribution, which can serve as a proxy in the context of limited data availability, but it overestimates the average firm size. Figure 4 compares the 2013 firm size distribution in Ghana using the establishment census and the WBES. Clearly, the distribution plotted with the WBES lies to the right of the one plotted using the establishment census. This reflects the difference between 33.3 workers per firm (WBES) and 5.2 workers per firm (establishment census).

Figure 4: Two sources for firm size distribution in Ghana in 2013
Source: Own calculations using Establishment census and WBES.

Thus, leveraging the average firm size estimated with ILO data, we re-scale the mean, p25, p50, p75, and p90 of the firm size distribution found in the WBES by the ratio between the average firm size estimated with ILO data and the average firm size estimated with WBES data. In practice, we “shift to the left” the WBES firm size
distribution, aiming to correct the bias we observed in the data.

3.3 Breakdown by formality

The next step in the design of the algorithm is the breakdown by formality status. ILO provides modeled data on the number of employers, own-account workers, and employees by country for every year from 2000 to 2020, which we have used in the previous calculations. However, information on the breakdown of employers, own-account workers, and employees working in formal or informal firms is available only for some years and for fewer countries, since reporting on firms' informality is not universal. Our approach assumes a constant share of informality within each worker type. To implement this, we use the median informality share within each worker type by country. Applying this share to the modeled data yields the number of employers, own-account workers, and employees, both formal and informal, for every country with data on firms' formality status between 2000 and 2020. These numbers are used to calculate the average firm size for both formal and informal firms using the ratio between employees and employers.

A further re-scaling is required after this. The simple approach would be to distinguish between formal and informal averages of firm size and apply the same re-scaling as above. However, the firm size distribution within informal firms may not only be shifted to the left compared to the distribution of formal firms but also have a higher concentration on the left tail, due to the large prevalence of micro firms among informal firms, as shown in figure 2: informal Ghanaian firms are more concentrated in the first size group. To correct this difference in the firm size distribution for formal and informal firms, we exploit the availability of formal and informal distinctions in the five micro-datasets to further re-scale the four percentiles from the firm size distribution of formal firms. This is done in two steps.
First, we calculate the five moments for formal firms as described above: starting from the average firm size among formal employers, own-account workers, and employees, the four percentiles are calculated by imposing the relative position of each percentile to the mean as found in the WBES9.

Second, we produce the same calculation for the mean and percentiles of informal firms based on the average firm size of informal firms, and then re-scale the percentiles of the informal distribution relative to the same percentiles of the formal size distribution. For this second re-scaling, we use microdata from Burkina Faso, Cameroon, Ghana, Rwanda and Kenya to extract the mean and four percentiles for both formal and informal firms and obtain the relative size of each percentile of the informal firm size distribution with respect to the same percentile of the formal firm size distribution. For instance, the p25 of the formal firms' size distribution is 2 in Ghana and 3 in Kenya, while the same percentile among informal firms is 1 in both Ghana and Kenya. This means that the p25 of informal firms is 1/2 of the p25 of formal firms in Ghana and 1/3 in Kenya. We produce this calculation also for percentiles p50, p75, and p90 and keep the mean across the five micro-datasets. In the example, this means using 1/3 as the adjustment scalar for the p25 of informal firms.

The method we followed has several limitations. First, estimates of the number of firms using employer statistics might not be consistent with estimates of the number of firms from establishment censuses. Second, we are using only five countries and thus imposing the shape of the firm size distribution based on limited information. In further revisions, we expect to include more countries and characterize the differences in the percentiles between formal and informal firms.
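The re-scaling logic described in Sections 3.2 and 3.3 can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: all function names are hypothetical, the WBES percentiles for Ghana are made up (only the means of 33.3 and 4.3 come from the text), and the informality shares passed to `split_by_formality` are invented. The sketch computes per-dataset percentile ratios only; how the ratios are aggregated across datasets is left to the caller.

```python
from statistics import median


def average_firm_size(employees, employers):
    """Average firm size when employers are equated with firms."""
    return employees / employers


def rescale_moments(wbes_moments, ilo_mean):
    """Shift WBES moments so that their mean matches the ILO-based mean.

    Every moment is multiplied by the same ratio, which preserves the
    relative position of each percentile with respect to the mean.
    """
    ratio = ilo_mean / wbes_moments["mean"]
    return {name: value * ratio for name, value in wbes_moments.items()}


def split_by_formality(modeled_count, observed_informal_shares):
    """Split a modeled ILO count using the median observed informality share."""
    share = median(observed_informal_shares)
    informal = modeled_count * share
    return modeled_count - informal, informal


def percentile_ratio(formal_value, informal_value):
    """Size of an informal percentile relative to the same formal percentile."""
    return informal_value / formal_value


# Ghana: the two means are from the text; the percentiles are hypothetical.
wbes_ghana = {"mean": 33.3, "p25": 8.0, "p50": 15.0, "p75": 35.0, "p90": 80.0}
shifted_ghana = rescale_moments(wbes_ghana, ilo_mean=4.3)

# p25 values quoted in the text: formal 2 (Ghana) and 3 (Kenya), informal 1.
p25_ratios = {
    "Ghana": percentile_ratio(2, 1),
    "Kenya": percentile_ratio(3, 1),
}

# Hypothetical informality shares observed in three survey years.
formal_employers, informal_employers = split_by_formality(1000, [0.6, 0.8, 0.7])
```

Multiplying all moments by a single ratio is what moves the WBES distribution to the left without changing its shape; the percentile ratios then capture the additional concentration of informal firms in the left tail.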
3.4 Number of workers by size and formality

Dividing the number of employees by the number of firms, it is possible to estimate the average firm size in the economy (26 workers per firm in Africa, 33 in the rest of the world). For all the countries where this calculation is possible, the elasticity to income measured with GDP per capita is 0.13. Bento and Restuccia (2021) estimate an elasticity of 0.3 for an average firm size that includes all people involved in business activities (but only manufacturing and services). In our setting, this would be equivalent to redefining people working for a business as the sum of employees and own-account workers, and also redefining firms as the sum of employers and own-account workers. This second definition yields an average firm size of 1.9 workers for the African continent and 7.7 elsewhere. The elasticity to income is 0.48, higher than the one initially estimated, and conceptually closer to the one estimated by Bento and Restuccia (2021).

9 For countries with no WBES data we use the average relative positions across countries covered by WBES.

We employ this framework to impute average firm size by size group and formality status using a model that links firm size in any of these groups to log GDP per capita, log GDP, the firm informality rate, the overall average firm size and, in logs, the numbers of own-account workers, employees and micro firms. The number of people working in each category of size and formality is then calculated by multiplying the estimated firm size by the number of firms. To ensure consistency, the number of workers in large firms, separately for formal and informal firms, is calculated as the difference between the total number of workers in the formal or informal market and the estimated number of workers in the other size groups obtained by the previous multiplication.
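The consistency rule at the end of this section amounts to simple arithmetic, sketched below. The function name and all numbers are hypothetical and serve only to show the mechanics: workers in the micro, small and medium groups are the product of the number of firms and the imputed average size, and large firms receive the residual.

```python
def workers_by_size(firms, avg_size, total_workers):
    """Workers per size group, with large firms computed as the residual.

    firms and avg_size are dicts keyed by "micro", "small" and "medium";
    total_workers is the total for one market (formal or informal).
    """
    workers = {
        group: firms[group] * avg_size[group]
        for group in ("micro", "small", "medium")
    }
    # Large firms absorb whatever is left so the groups add up to the total.
    workers["large"] = total_workers - sum(workers.values())
    return workers


# Hypothetical inputs for one country and one formality status.
example_firms = {"micro": 1000, "small": 100, "medium": 10}
example_avg_size = {"micro": 2, "small": 10, "medium": 40}
example_workers = workers_by_size(example_firms, example_avg_size, 5000)
```

With these made-up inputs the three smaller groups account for 3,400 workers, so large firms are assigned the remaining 1,600. In practice one would also want to check that the residual is not negative, which the sketch does not do.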
4 Results

4.1 Total number of firms and share of workers associated with them

This paper estimates that there were 244 million businesses on the African continent in 2020. The large majority of these (231 million) were own-account businesses, while the remaining 12.7 million were firms that employ workers.

Table 1 shows the breakdown by size. Excluding own-account businesses, there were 10.5 million micro firms in Africa in 2020. They represent 83% of all firms on the continent and absorb 20% of the labor force, less than 1/3 of the labor share of own-account businesses. Small firms number 1.5 million, constitute 12% of all firms, and employ 3% of the labor force. In contrast, there are approximately 102,000 large firms, which absorb 7% of the labor force, about twice as much as all small firms combined. It is worth noting that firms, defined here as employers, absorb 36% of the labor force.

Table 1: Estimated number of firms (in thousands) and labor absorption

                       (1)                (2)
                       Number of Firms    Labor Share (%)
Own-account            231,689            64
Micro (less than 5)    10,516             20
Small (5-19)           1,520              3
Medium (20-100)        603                6
Large (100+)           102                7
Total                  244,429
Source: Own calculations

Table 2 shows the estimated number of firms by employment size and country. The number of firms increases with population. The correlation between the log number of firms and the log population is 0.9, while the correlation between the log number of own-account businesses and the log population is 0.91. Egypt, Nigeria and South Africa are the countries with the largest number of firms. The combined number of firms in these three countries alone constitutes 36% of all firms in Africa. This is higher than the population share of these three countries, 26%. By size group, the correlation with population is highest for the number of micro firms, 0.8, and lowest for the number of large firms, 0.7.
In fact, the correlation between the share of micro firms and population is 0.34, while the correlation between the share of large firms and population is -0.18. This suggests a larger prevalence of micro firms in larger countries on the African continent.

4.2 By formality status

Using the methodology described above, we break down the number of firms by formality status. Table 3 suggests that 9,324 thousand firms operate in the informal markets. This represents an informality rate of 73%. Although high, the informality rate among firms is still lower than the informality rate estimated for own-account businesses, where 94% of the 231 million own-account units operate informally. By size, firm informality is driven by micro firms, where 87% of firms in this size category are informal. Informality among small firms is estimated to be 14%, while 2% is the rate of informality among medium-sized firms and 1% among large firms.

Table 2: Estimated number of firms in 2020 (thousands)

ISO    Micro     Small    Medium  Large  Total     Own-account  Total+Own-account  Population
AGO    525.3     10.6     27.9    1.8    565.6     7,051.1      7,616.7            33,428.5
BDI    52.9      5.8      1.5     0.3    60.4      2,862.8      2,923.2            12,220.2
BEN    57.6      3.0      1.9     0.3    62.8      3,318.2      3,381.0            12,643.1
BFA    36.8      5.0      2.9     0.2    44.9      3,620.0      3,664.9            21,522.6
BWA    8.7       6.6      2.8     0.9    19.1      197.4        216.5              2,546.4
CAF    12.4      2.3      0.6     0.1    15.4      1,043.0      1,058.4            5,343.0
CIV    126.2     16.6     7.0     1.6    151.4     5,834.3      5,985.7            26,811.8
CMR    245.4     39.2     12.2    3.9    300.8     6,339.6      6,640.3            26,491.1
COD    628.4     39.0     39.3    3.5    710.2     19,287.7     19,998.0           92,853.2
COG    12.9      1.5      2.4     0.1    16.9      1,242.5      1,259.4            5,702.2
COM    1.2       0.3      0.3     0.0    1.9       107.4        109.2              806.2
CPV    4.5       2.9      0.8     0.2    8.2       50.7         59.0               582.6
DJI    3.9       2.2      0.3     0.1    6.6       78.1         84.7               1,090.2
DZA    292.8     117.5    42.9    9.7    462.9     2,568.1      3,031.0            43,451.7
EGY    2,009.9   293.0    94.3    20.0   2,417.2   3,338.2      5,755.4            107,465.1
ERI    10.8      6.1      1.8     0.2    18.9      847.5        866.4              3,555.9
ETH    140.3     64.4     43.7    10.4   258.9     26,581.9     26,840.8           117,190.9
GAB    7.3       3.4      1.7     0.6    13.0      162.4        175.4              2,292.6
GHA    616.9     53.6     11.7    2.0    684.2     8,200.8      8,885.0            32,180.4
GIN    54.8      6.1      0.7     0.2    61.7      2,516.1      2,577.8            13,205.2
GMB    8.4       4.1      1.2     0.2    13.9      519.0        532.9              2,574.0
GNB    5.9       0.6      0.3     0.0    6.8       362.4        369.2              2,015.8
GNQ    40.1      0.5      0.3     0.0    40.9      351.2        392.2              1,596.0
KEN    271.0     29.2     22.4    1.1    323.8     7,835.4      8,159.2            51,985.8
LBR    25.6      3.2      3.9     0.3    33.0      1,544.2      1,577.2            5,087.6
LBY    32.9      6.8      7.6     0.8    48.0      452.4        500.4              6,653.9
LSO    18.2      2.9      0.4     0.3    21.8      313.4        335.2              2,254.1
MAR    153.6     52.2     26.9    5.6    238.4     3,219.1      3,457.5            36,688.8
MDG    605.5     26.2     4.2     1.3    637.2     6,570.9      7,208.1            28,225.2
MLI    28.7      8.2      5.7     0.8    43.4      3,934.5      3,977.9            21,224.0
MOZ    299.3     12.1     8.0     2.2    321.5     8,368.3      8,689.8            31,178.2
MRT    30.3      3.6      1.8     0.3    35.9      436.0        472.0              4,498.6
MUS    12.0      2.7      2.0     0.6    17.3      80.9         98.2               1,266.0
MWI    65.3      25.6     5.7     1.7    98.3      3,855.0      3,953.2            19,377.1
NAM    33.3      10.4     1.7     0.5    46.0      201.6        247.6              2,489.1
NER    58.6      6.0      1.7     0.4    66.7      5,649.7      5,716.3            24,333.6
NGA    1,148.0   30.8     63.9    2.0    1,244.7   42,340.1     43,584.8           208,327.4
RWA    8.5       24.0     8.0     1.2    41.8      1,617.4      1,659.2            13,146.4
SDN    320.1     126.7    34.3    4.9    486.1     3,869.2      4,355.3            44,440.5
SEN    70.6      10.2     3.4     1.7    85.9      2,411.1      2,497.1            16,436.1
SLE    65.6      6.2      0.4     0.1    72.3      2,091.8      2,164.1            8,234.0
SOM    19.0      5.6      0.8     0.2    25.6      1,451.6      1,477.2            16,537.0
SSD    36.9      8.1      2.2     0.3    47.4      1,635.3      1,682.7            10,606.2
STP    0.8       0.3      0.1     0.0    1.3       28.1         29.3               218.6
SWZ    2.3       2.1      1.1     0.2    5.7       87.4         93.1               1,180.7
TCD    56.2      4.4      1.6     0.3    62.5      3,617.1      3,679.6            16,644.7
TGO    35.8      5.2      2.4     0.7    44.2      1,950.7      1,994.8            8,442.6
TUN    131.7     32.5     13.8    3.3    181.3     484.8        666.1              12,161.7
TZA    733.9     76.9     10.3    1.3    822.3     13,603.0     14,425.4           61,704.5
UGA    619.7     40.5     3.5     0.4    664.0     9,691.7      10,355.7           44,404.6
ZAF    719.1     262.2    53.7    12.4   1,047.4   1,732.7      2,780.1            58,801.9
ZMB    5.9       4.5      6.2     0.5    17.0      2,897.4      2,914.4            18,927.7
ZWE    4.0       6.3      6.7     0.3    17.3      3,237.1      3,254.4            15,669.7
Total  10,515.6  1,520.1  603.1   101.7  12,740.6  231,688.8    244,429.4          1,358,715.0
Source: Own calculations

Table 3: Estimated number of formal and informal firms (thousands)

                       (1)        (2)
                       Formal     Informal
Micro (less than 5)    1,361      9,155
Small (5-19)           1,377      143
Medium (20-100)        577        26
Large (100+)           101        0
Total                  3,416      9,324
Own-account            13,355     218,334
Total + Own-account    16,771     227,658
Source: Own calculations

Firms, as defined here, absorb 36% of the labor force on the African continent. Labor force, in this case, covers anyone engaged in labor who is not an employer. It is thus the sum of individuals working as employees and individuals engaged in own-account activities. We estimate that 64% of the labor force is engaged in own-account activities, the large majority of it informally (60.2% of the labor force). In addition, 21% is employed by formal firms and 15% by informal firms. Most of those employed by informal firms (14.8% of the labor force) work in micro businesses. The labor share across firm sizes in the formal market is less concentrated; however, it exhibits the well-known regularity of the missing middle found with other data sources, as in Abreha et al. (2022). While formal micro firms absorb 4.9% of the labor force, small firms are responsible for 3.3%. Medium-sized firms, on the other hand, account for 5.7% of workers, slightly more than micro firms. Large firms absorb the largest share of labor within the formal sector, with 7.1% of the labor force employed by these firms.

Table 4: Labor absorption (% of the labor force)

                       (1)        (2)
                       Formal     Informal
Micro (less than 5)    4.9        14.8
Small (5-19)           3.3        0.2
Medium (20-100)        5.7        0.1
Large (100+)           7.1        0.0
Total                  21.0       15.1
Own-account            3.7        60.2
Total + Own-account    24.7       75.3
Source: Own calculations

4.3 Firm density

The distinction between own-account business activities and those carried out by employers has implications for our understanding of a firm and its association with development. One way to see this is by exploring business density across regions. Figure 5 explores the correlation of firm density by income for different types of firms across African countries.
The density of own-account businesses is higher among poor countries, while the density of larger and formal firms is highly correlated with GDP per capita for various definitions of formal firms (panels b, c and d of figure 5). Taken together, these results suggest that firm density on the African continent is low unless we adopt a less strict definition that includes individuals engaged in own-account activities. Put differently, the density of firms associated with higher formality rates and greater absorption of formal employment is low in Africa.

Figure 5: Density of Firms and Gross Domestic Product per Capita, 2020
Source: Own calculations.

4.4 Comparison with census data

To validate our results, we compare collected census data with the estimates produced by the methodology applied in this paper for the respective year. The countries with available microdata were Burkina Faso (2016), Cameroon (2009), Ghana (2014) and Rwanda (2014, 2017 and 2020). While censuses intend to capture the whole universe of non-agricultural firms in a country, they generally narrow the scope by focusing on establishments with fixed premises. This, by design, excludes, for example, hawkers, street vendors, taxis, roads, and building construction sites, many of which might be micro firms. With our definition of a firm as an employer, we expect the estimated number of firms to be larger than the number of firms reported in each census.

Figure 6 shows the correlation between the algorithm estimates and the census counts for the log number of firms in the formal sector by employment size. The 45-degree line indicates perfect correspondence between the estimates and the census. In general, our estimates are larger than the numbers reported in the census in almost every country-year comparison. The correlation between these numbers across country-year observations is highly positive, however.
Figure 6: Mapping between algorithm estimates and establishment census data
Source: Own calculations.

Figure 7 explores the differences in size shares within each dataset. Shares are calculated with respect to the total number of firms contained in each dataset. The largest difference in shares between the algorithm and the census data is in the group of micro firms. Overall, the algorithm underestimates the share of micro firms while it overestimates the share of small and medium firms. The underestimation of the share of micro firms is particularly pronounced for the informal sector. This means that the shares of firms in the formal sector are overestimated by the algorithm, compared with collected census and survey data for these countries.

Figure 7: Differences in size shares between algorithm and establishment census data
Source: Own calculations.

5 Conclusion

The methodology presented in this paper offers a valuable tool for estimating the number of formal and informal firms in Africa. By utilizing data from multiple sources and employing statistical techniques to address under-reporting and other biases, we have developed a more precise and comprehensive estimate of the number of firms in Africa than was previously available. An extension of this methodology could be used in micro-simulations to analyze the impact of various policies and interventions on the business environment.

Utilizing this approach, we estimate that as of 2020, there were about 244 million businesses in Africa, with a significant majority being micro or own-account enterprises. Among these, only 3.4 million were formal firms with employees, and only about 2 million of those employed five or more individuals. About 242 million businesses were either micro firms (with fewer than five employees) or own-account enterprises, and 231 million were own-account enterprises, where the proprietor constituted the sole employee.
The proposed methodology provides valuable insights to researchers and policymakers by enabling the assessment of market potential, such as for technology adoption based on firm characteristics, as described in Cruz (2024).

It is also worth noting that our methodology is not confined to the context of Africa. It can be adapted for various regions, offering policymakers and researchers a valuable tool to understand the dynamics of the informal sector in different settings. Such insights are essential for devising strategies that encourage formalization and foster economic growth.

This paper describes an initial attempt to estimate measures of firm demographics in Africa. The methodology combines several data sources under strong assumptions to overcome the challenges of limited information on business demographics in the region. This is an important avenue for future research and for filling the data gap worldwide, particularly in developing countries. Increasing the availability of harmonized establishment and firm-level censuses, along with robust representative surveys for Africa, will further validate and improve these estimates. It would also allow further disaggregation of the indicators and a better understanding of business dynamics on the continent. Three key steps are needed to advance this agenda:

1. Improve the assessment and availability of existing data;
2. Invest in ex-ante and ex-post harmonization of firm-level data globally;
3. Enhance the capacity of National Statistical Agencies to generate and provide business statistics.

We plan to continue this effort to improve the availability and quality of harmonized firm-level data across the African continent and globally.

References

Abreha, K. G., X. Cirera, E. A. R. Davies, R. N. Fattal Jaef, and H. B. Maemir (2022): “Deconstructing the Missing Middle: Informality and Growth of Firms in Sub-Saharan Africa,” Policy Research Working Paper Series 10233, The World Bank.
Aga, G. and H. Maemir (2021): “COVID-19 and African Firms.”
Bento, P. and D. Restuccia (2021): “On average establishment size across sectors and countries,” Journal of Monetary Economics, 117, 220–242.
Bonnet, F., J. Vanek, and M. Chen (2019): “Women and men in the informal economy: A statistical brief,” International Labour Office, Geneva, 20.
Choi, J., M. A. Dutz, and Z. Usman (2020): The future of work in Africa: Harnessing the potential of digital technologies for all, World Bank Publications.
Coase, R. H. (1937): “The Nature of the Firm,” Economica, 4, 386–405.
Cruz, Marcio, E. (2024): Digital Opportunities in African Businesses, International Finance Corporation. The World Bank Group.
Cruz, M., E. Salgado, and T. Tran (2024): Economywide Effects of Digitalization. Digital Opportunities in African Businesses, International Finance Corporation. The World Bank Group.
Decker, R. A. and J. Haltiwanger (2022): “Business entry and exit in the COVID-19 pandemic: A preliminary look at official data.”
Economist, T. (2025): Africa has too many businesses, too little business, https://www.economist.com/special-report/2025/01/06/africa-has-too-many-businesses-too-little-business.
Eslava, M., M. Meléndez, G. Ulyssea, N. Urdaneta, and I. Flores (2024): “Firms and inequality in Latin America.”
IFC (2017): MSME Finance Gap Report, International Finance Corporation. The World Bank Group.
ILO (2023): “ILO modelled estimates database, ILOSTAT,” Accessed 05-02-2023.
Levy, S. (2024): New Technologies, Productivity, and Inequality in Latin America, International Finance Corporation. The World Bank Group.
Li, Y. and M. Rama (2015): “Firm dynamics, productivity growth, and job creation in developing countries: The role of micro- and small enterprises,” The World Bank Research Observer, 30, 3–38.
Loayza, N. V. (2016): “Informality in the Process of Development and Growth,” The World Economy, 39, 1856–1916.
Ohnsorge, F. and S. Yu (2022): The long shadow of informality: Challenges and policies, World Bank Publications.
Papke, L. E. and J. M. Wooldridge (1996): “Econometric Methods for Fractional Response Variables with an Application to 401(K) Plan Participation Rates,” Journal of Applied Econometrics, 11, 619–632.
Porta, R. L. and A. Shleifer (2008): “The Unofficial Economy and Economic Development,” Brookings Papers on Economic Activity, 2008, 275–352.
Quiros-Romero, G., T. F. Alexander, and J. Ribarsky (2021): “Measuring the Informal Economy,” Policy Papers, 2021.
Teal, F. (2023a): “Firm size, employment and value added in African manufacturing firms: Why Ghana needs its 1%,” Journal of African Economies, 32, 118–136.
Teal, F. (2023b): “What Explains the Firm Size Distribution in Sub-Saharan Africa and Why Does It Matter?” Journal of African Economies, 32, 111–117.
Tsaedu, K. G., Z. Chen, and H. W. Azmete (2023): “Firm Size Distribution in African Manufacturing Firms: Revisiting the ‘Missing Middle’,” Journal of African Economies, 32, 137–150.
A Appendix

Table A1: Sources of ILO micro-level data

ISO    Source
       PC - Recensements de Population
AGO    LFS - Employment Survey
BEN    HIES - Enquête de Suivi de l'Enquête Modulaire et Intégrée sur les Conditions de Vie des Ménages
BWA    HS - Multi-Topic Household Survey
       HIES - Enquête Multisectorielle Continue
BFA    LFS - Enquête Régionale Intégrée sur l'Emploi et le Secteur Informel
BDI    HIES - Living Standards Measurement Survey
CMR    HS - Household Survey
CPV    LFS - Continuous Multi-Objective Survey Employment and Labor Market Statistics
TCD    HIES - Enquête Modulaire et Intégrée sur les Conditions de Vie des Ménages
COM    LFS - Enquête Nationale sur l'Emploi et le Secteur Informel
COD    LFS - Enquête sur l'Emploi et les Conditions de Vie des Ménages
       HIES - Enquête par Grappes à Indicateurs
CIV    LFS - Enquête Nationale sur la Situation de l'Emploi
DJI    HS - Enquête Djiboutienne auprès des Ménages
EGY    LFS - Labour Force Sample Survey
SWZ    LFS - Labour Force Survey
ETH    LFS - National Labor Force Survey
GNB    HIES - Enquête harmonisée sur les conditions de vie des ménages
KEN    HIES - Household Budget Survey
LBR    HIES - Household Income and Expenditure Survey
MDG    HS - Enquête périodique auprès des Ménages
MLI    LFS - Enquête Emploi Permanente Auprès des Ménages
MRT    LFS - Enquête Nationale de Référence sur l'Emploi et le Secteur Informel
MUS    LFS - Continuous Multi-Purpose Household Survey
NER    HIES - Enquête Nationale sur les Conditions de Vie de Ménages
       HS - General Household Survey
NGA    HIES - Socio Economic Survey
RWA    HIES - Enquête Intégrale sur les Conditions de Vie de Ménages
       LFS - Enquête sur la Population Active
SEN    LFS - Enquête Nationale sur l'Emploi
SLE    HS - Integrated Household Survey
ZAF    LFS - Quarterly Labour Force Survey
SDN    LFS - Household Survey
TZA    HIES - National Panel Survey
TGO    HIES - Questionnaire des Indicateurs de Base du Bien-être
       LFS - Enquête Nationale sur la Population et l'Emploi
TUN    LFS - Labor Market Panel Survey
UGA    LFS - National Labour Force Survey