Childcare and Mothers’ Labor Market Outcomes in Lower- and Middle-Income Countries


Improving women's labor force participation and the quality of their employment can boost economic growth and support poverty and inequality reduction; thus, it is highly pertinent for the development agenda. However, existing systematic reviews on female labor market outcomes and childcare, which can arguably improve these outcomes, are focused on developed countries. We review 22 studies which plausibly identify the causal impact of institutional childcare on maternal labor market outcomes in lower- and middle-income countries. All but one of the studies find positive impacts on the extensive or intensive margin of maternal labor market outcomes, which aligns with findings from developed countries. We further analyze aspects of childcare design, including hours, ages of children, and coordination with other childcare services, that may increase the impacts on maternal labor market outcomes. We conclude with a discussion of directions for future research.


Policy Research Working Paper 9828
This paper is a product of the East Asia and Pacific Gender Innovation Lab and the Gender Global Theme. It is part of a larger effort by the World Bank to provide open access to its research and make a contribution to development policy discussions around the world. Policy Research Working Papers are also posted on the Web at http://www.worldbank.org/prwp. The authors may be contacted at dhalim@worldbank.org and eperova@worldbank.org.

INTRODUCTION
Greater female engagement in the labor market results in a number of benefits for countries and their residents. Hsieh et al. (2019) show that better allocation of women's talent across sectors, including leaving the home sector for market work, significantly contributed to US growth between 1960 and 2010. Increased growth in female labor income in Latin America between 2000 and 2010 accounted for 28 percent of the reduction in inequality and 30 percent of the reduction in extreme poverty (Diaz and Rodriguez-Chamussy 2016). In East Asia, increasing female labor force participation (FLFP) has been shown to be more effective in mitigating the projected decline in the labor force due to an aging population than increased immigration or delayed retirement (World Bank 2016). At the household level, women's participation in the labor market helps households diversify income sources, allowing them to better insure against risk (Blundell, Pistaferri, and Saporta-Eksten 2016). However, mothers' traditional role in caring for children can be an impediment to their labor market engagement. In this paper, we review research that examines how improved access to institutional childcare impacts mothers' labor market engagement in lower- and middle-income countries: labor force participation (LFP) and employment as well as hours of work, type of employment, and income.
A body of research from upper-income countries has documented that government policy around childcare has effectively contributed to improving women's labor market outcomes, such as LFP, employment, and work hours (see reviews by Del Boca 2015; Hegewisch and Gornick 2011; Morrissey 2017). Both the macroeconomic and microeconomic literatures tend to find overall positive effects of subsidized childcare on female and maternal employment in higher-income countries, particularly in countries with relatively higher costs of childcare at baseline (Olivetti and Petrongolo 2017). Family-oriented policies (including family subsidies that could be used for childcare) along with labor market institutions (such as employment protection) explain almost 25 percent of the actual increase in LFP for young women in 15 European Union countries (Cipollone, Patacchini, and Vallanti 2014).
The relationship between maternal labor market outcomes and access to childcare may be different in developing countries. First, the nature of jobs available for women in low- and middle-income countries may be compatible with simultaneously minding children: agricultural work or selling goods at a local market can be done with children alongside. Indeed, Aaronson et al. (2021) find that fertility, which is proportional to maternal years devoted to childcare, is not significantly related to female labor supply in developing countries, only in developed countries. Examining data from 103 countries (both developed and developing) over 200 years, they conclude that there is greater compatibility of childcare with agriculture. FLFP may be already high in poorer nations due to the rural nature of their economies, and less dependent on availability of childcare services. Aligned with this hypothesis, in a country-level analysis examining differences in preschool enrollment between 1965 and 1980, O'Connor (1988) argues that increases in preschool enrollment are likely driven by increases in women's participation in industrial and service sectors. Second, women's labor market response to childcare supply or costs depends on the current level of female participation in the labor force (Akgunduz and Plantenga 2018). Thus, in comparison to findings from high-income countries, we may expect greater heterogeneity in impacts of childcare across different low- and middle-income countries, given a wider range of initial FLFP levels.
Thus, we may expect the maternal labor market response to the availability of childcare in low- and middle-income countries to differ from the well-established positive relationship found in high-income countries.
While country-level studies of impacts of childcare on maternal labor market outcomes abound, there have been few attempts to systematize this evidence from lower- and middle-income countries. An initial review of the evidence by Todd (2013) confirms the general finding that increasing availability and lowering the price of childcare increases maternal labor force participation (MLFP) in developing and transitioning economies. However, her review is part of a larger work on family-friendly policies and it is limited to the then scarce evidence on the topic of childcare, with only 3 studies having a causal inference research design. A growing number of studies on childcare and maternal employment with more plausibly causal designs have appeared since her 2013 review.
We have more overlap with the meta-analysis by Harper, Austin, and Nandi (2017), who provide an estimate of aggregated effect sizes of childcare on MLFP in lower- and middle-income countries: our review includes 9 of the 13 studies they cover. Limiting their meta-analysis to the 9 quasi-experimental or randomized studies, they calculate that a 30 percent increase in daycare utilization increases maternal employment by 6 percentage points, indicating responsiveness of women's labor supply to childcare availability. Evans, Jakiela, and Knauer (2021) review to what extent evaluations of early childhood development (ECD) interventions analyze impacts on mothers and conclude that only 4% of the 478 studies they reviewed analyzed maternal labor market outcomes. As they focus on ECD, only 5 studies from our review are included in their work; we also include childcare interventions without an explicit ECD objective.
Our work extends earlier reviews in several respects. First, we include a larger number of studies (22) but limit them to the ones which plausibly identify causal impacts, excluding correlational studies. Second, in addition to standard institutional care in which paid workers use a facility specifically for childcare, our review includes primary school and after-school care, childcare provision at the location of employment, community centers, and community mothers' homes. Third, while the theme of increased MLFP does unite the literature in all the papers included in this review, we also discuss hours worked, income, and type of work, though these outcomes are examined less consistently throughout the literature. Our strongest contribution is a discussion of conditions and specifics of childcare implementation that may affect the magnitude of the impacts. Thus, this review may provide additional insights to policy makers concerned with the specifics of the design of institutional childcare programs.

As an organizational framework, we group studies by the econometric approaches used to identify causal impacts, providing a brief discussion of the limitations and advantages of each approach. After the review of findings by type of econometric analysis, we highlight some contextual differences and details of implementation that appear in the studies and may affect the strength of the impact of childcare on maternal labor market outcomes. We conclude with a discussion of additional questions for future research and a summary of findings.

METHODS
The "universe" of considered studies was generated from searches on Google Scholar using keywords such as "childcare" and "preschool" in tandem with "female labor force participation", "maternal labor supply", and other synonyms. Based on their titles, we selected papers for further consideration. From this initial list of papers, we ran forward and backward citation checks to identify other potentially relevant articles until we found no further citations. From these, we selected studies that focus on lower- and middle-income countries. 2 We considered including studies of low-income minority populations within wealthy countries (e.g., Inuit in Canada (Feir and Thomas 2019), Arabs in Israel (Schlosser 2011)). However, many studies from wealthy countries include a component examining the poorest populations, which are often minority populations; this criterion would extend our scope too broadly, so we decided not to include these studies. We also limited our scope by excluding studies that only evaluate informal care or paid nannies.
We limited studies to empirical works and did not review qualitative contributions. We included studies with quantitative methodology specifically examining the causal impact of childcare on maternal labor force outcomes. We included studies that have Randomized Controlled Trial (RCT), Regression Discontinuity (RD), Difference-in-Difference (DiD), Quasi-Experimental, and Instrumental Variables (IV) designs. We excluded studies on female labor market participation using cross-sectional data that are not in a natural experiment framework. We excluded one randomized controlled trial paper due to its small sample size. 3 We included unpublished theses, working papers, and conference papers in order to cover a wide range of evidence and to minimize publication bias toward statistically significant results, although the majority of findings are statistically significant. Once we limited the set of studies to those that plausibly identify causal impacts, we found data quality sufficiently high; thus, no paper from the causal set was dropped on the basis of data quality. We limited our documents to work published in English and Spanish. Our final selection is found in Table 1.
We first organize our review by econometric analysis types, which has two benefits with regard to comparability across studies. First, for each methodology we note the key assumptions required for a causal interpretation of the results and comment on the degree to which the authors examine whether their data meet these criteria, allowing for some reflection on the strength of the conclusions. Second, because of the broad variety of approaches to assess the impact of childcare on maternal labor market outcomes, grouping by analysis type allows for some degree of comparability within each section. For example, an analysis of mothers receiving a voucher for childcare is very different from an analysis of a municipality expanding its childcare centers such that there are 15 more seats per 1,000 children. Because papers using the same empirical approach tend to have similar treatments, the results are more comparable within each methodology than across methodologies.

RESULTS BY ANALYSIS
The 22 papers included in our review that attempt to identify impacts of institutional childcare rely on a wide range of methodologies: Randomized Controlled Trials, Regression Discontinuity, Difference-in-Difference, Quasi-Experimental, and Instrumental Variables designs. Each of the methods offers advantages and limitations for interpretation of findings for policy design. Two dimensions are of particular importance: plausibility of causal interpretation and external validity. We organize our discussion below moving through this continuum.
We start by discussing RCTs, the gold standard for causal interpretation, although their findings are frequently limited to a specific population and may not warrant extrapolation to a similar intervention implemented at the national scale. Three studies are included in this category. We next discuss RD design studies, which largely rely on discontinuity in the age of eligibility for preschools. In this category, again, a number of assumptions need to be tested for causal interpretation of impacts (such as continuous density around the cutoff point and similarity of individuals on either side of the cutoff point). Similarly, these studies provide estimates around the discontinuity, which may not lend themselves to extrapolation to populations further away from the discontinuity (e.g., to mothers of children of different ages). Three studies in our review fall into this category.
The largest category in our review relies on DiD design (either double or triple differences). It includes nine papers, which largely take advantage of national rollouts of institutional childcare programs. These papers offer estimates that are more likely to be valid for the national level programs. 4 However, a number of assumptions need to hold for the resulting estimates to be interpreted as causal. Some of these assumptions are not testable; for example, we cannot test if the error term conceals unobserved heterogeneity correlated with preschool rollout and maternal labor market outcomes. Others, such as the parallel trends assumption, are testable but they are not consistently tested in every paper in the review. We point to these limitations when discussing the results.
The last two categories rely on quasi-experiments using waitlists (two papers) and IV (five papers) approaches. Both approaches require more stringent assumptions for causal interpretations, compared to RCTs and RD designs; and unlike DiD designs, they are limited to specific populations. 5 We discuss authors' approaches to test assumptions needed for causal interpretation, as well as external validity concerns, within each category, and weave these considerations into the discussion of results. The studies are listed in Table 1.

Randomized Controlled Trials (RCTs)
RCTs are the gold standard for determining the impact of interventions. Random assignment of the treatment (childcare provision in this case) ensures that there are no concerns regarding selection biases. This is often confirmed by checking equality of pre-treatment characteristics across the treatment and control groups and controlling for variables that may be imbalanced. However, there are some limitations to these analyses, the most important of which is external validity: conclusions are generally restricted to a local population.
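A balance check of this kind can be sketched as follows. The example computes a standardized mean difference for one baseline covariate. The data are hypothetical and the function name `standardized_difference` is ours, introduced only for illustration; the reviewed studies may rely on other diagnostics, such as t-tests of equality of means.

```python
from math import sqrt
from statistics import mean, variance


def standardized_difference(treated, control):
    """Difference in means scaled by the pooled standard deviation.

    Values near zero suggest the covariate is balanced across arms.
    Assumes each arm has at least two observations and some variation.
    """
    pooled_sd = sqrt((variance(treated) + variance(control)) / 2)
    return (mean(treated) - mean(control)) / pooled_sd


# Hypothetical baseline ages of mothers in each arm (illustration only).
treated_age = [24, 27, 30, 33]
control_age = [25, 27, 29, 33]
print(standardized_difference(treated_age, control_age))
```

A covariate with a large standardized difference would typically be added as a control in the outcome regression, as the paragraph above describes.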
Three studies in our review used a randomized controlled trial methodology: Barros et al. (2011) in Brazil, Clark et al. (2019) in Kenya, and Martinez and Perticara (2017) in Chile. The focus was among low-income residents in Nairobi, Kenya, and in Rio de Janeiro, Brazil. In contrast, the Chilean program was implemented in 21 municipalities and the authors show the program schools have similar demographic characteristics to the national average. The programs differed in the type of childcare provided and the ages of children served. The Brazilian and Kenyan programs provided full-time childcare for very young children (ages 0-3 years and ages 1-3 years respectively) while the Chilean care was an after-school program for primary school students (aged 6 to 13). The randomization process across programs differed in the sense that in Brazil and Chile, children were randomly selected among those whose parents expressed interest in participating in the programs and registered for it. In Kenya, the sample was selected among women who did not have their children enrolled in daycare but the authors did not have any information about their interest in participating in the program. 6 Two RCTs, in Kenya and Chile, collect baseline data and demonstrate that treatment and control groups are balanced. The RCT in Brazil only includes one round of data collection, hence testing for balance based on pre-treatment characteristics is not possible. 7 Attrition rates were coincidentally similar in all three studies: around 13%. The Chilean and Brazilian studies demonstrate no differential attrition; in Kenya, where younger and more educated women assigned to the control group are more likely to be lost to follow up, the authors control for these characteristics in their regression analysis.
All three studies show positive impacts on maternal work; studies in Brazil and Chile also show increases in LFP rate, but this outcome is not included in the Kenya paper. The results are of similar range: Intent to Treat (ITT) estimates are 3, 4 and 10 percentage points in Chile, Brazil and Kenya, respectively (Table 2). However, the differences loom larger as a percentage of control group average: the effect size in Kenya is nearly three times as high as in Chile and twice as high as in Brazil. Local Average Treatment Effects (LATE)/Treatment on the Treated (ToT) 8 estimates are also similar in terms of absolute magnitude, though they differ when expressed as a percentage of control group average.
We also note that for all studies LATE/ToT estimates are between 2 and 4 times higher than ITT estimates. We interpret this as suggesting that the absence of childcare is a binding constraint only for a subset of eligible women: not all of them end up using childcare. Heterogeneity analysis from two studies is consistent with this reasoning: in Brazil and Chile, the impacts are higher for specific subgroups. In Brazil, employment almost doubled among mothers who were not working before the lottery took place (from 9 to 17 percent), but there were no statistically significant impacts on hours worked for mothers who were employed (Barros et al. 2011). In Chile, the impact of after-school care was stronger for women who had children under 5, and it also influenced the likelihood of enrolling these younger siblings in public daycare: once the constraint on work from the older child was removed, women appear to choose to remove the same constraint imposed by another child.
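The link between the two estimands can be made concrete with the standard Wald ratio: under the usual assumptions for a randomized offer (the offer affects outcomes only through childcare use, and no one is pushed out of childcare by the offer), the LATE equals the ITT divided by the treatment-control gap in take-up. The sketch below is illustrative only; the function names and the take-up figures are hypothetical and are not taken from the reviewed studies.

```python
def late_from_itt(itt_pp, takeup_treated, takeup_control):
    """Wald ratio: ITT effect divided by the gap in childcare take-up."""
    first_stage = takeup_treated - takeup_control
    if first_stage <= 0:
        raise ValueError("take-up must be higher in the treatment group")
    return itt_pp / first_stage


def implied_takeup_gap(itt_pp, late_pp):
    """Take-up gap implied by a given pair of ITT and LATE estimates."""
    return itt_pp / late_pp


# Hypothetical example: a 4 percentage point ITT with 75% take-up among
# those offered childcare and 25% among controls.
print(late_from_itt(4.0, 0.75, 0.25))

# A LATE that is 2 to 4 times the ITT corresponds to a take-up gap
# between one half and one quarter.
print(implied_takeup_gap(1.0, 2.0), implied_takeup_gap(1.0, 4.0))
```

Read this way, a LATE/ToT estimate several times larger than the ITT estimate is what one would expect when only a fraction of eligible mothers change their childcare use in response to the offer.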
Results on income show a sizeable increase in household incomes of 16 percent in Brazil (Barros et al. 2011). A similarly sized increase is observed in Kenya: mothers receiving a childcare voucher have 24 percent higher monthly earnings. The earning gains are similar for married and single working mothers, although single mothers did not significantly change their employment status. However, single working mothers receiving subsidized childcare benefitted by working fewer hours than those not given childcare without any loss to their earnings; they shift to jobs with more regular hours (Clark et al. 2019). In Chile, there was no impact on income (Martinez and Perticara 2017); the same mechanism may have been at work as among single mothers in Kenya.

Fuzzy Regression Discontinuity in Eligibility Based on the Child's Age
The regression discontinuity approach uses an eligibility cut-off to contrast very similar groups on either side. The age cut-off is often used in education analysis to contrast children with birthdays right before a school enrollment cut-off to children with birthdays right after. When the age cutoff is observed strictly, children with birthdays prior can enroll in school, but those after cannot. Thus, those with birthdays right after the cut-off have one year less of publicly provided childcare, which creates a sharp discontinuity in the likelihood of receiving care. However, pre-primary schooling may not be compulsory 9 and there may be an optional enrollment birthday window, so parents with children whose birthdays fall between these dates may decide that their children wait another year for school. Thus, there is not a perfect alignment of treatment across the age cut-off, but there is a dramatic discontinuous drop in the probability that the child will be enrolled. This results in a "fuzzy" RD design.
This type of analysis relies on two assumptions. First, there is a continuous density of observations from the left to the right side of the age cutoff. In other words, people do not strategically manipulate the month when their child is born in order to become eligible for childcare. Second, individuals close to the age cutoff should be very similar, except for the childcare treatment. While similarities in observable characteristics can be tested, overall similarity ultimately has to be assumed because some characteristics unobservable to the researcher may be discontinuous at the cutoff. The assumption is more plausible if the sample is restricted to narrow bandwidths on either side of the cutoff. However, this restriction means that the treatment effect can only be estimated "locally" within that narrow window and for those receiving childcare ("compliers").
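The logic of the fuzzy design can be illustrated with a deliberately simplified sketch: within a bandwidth around the cutoff, the jump in maternal employment is divided by the jump in enrollment. This comparison of local means is an assumption made for brevity; the reviewed studies typically estimate local regressions that also control for trends in the running variable. The data and the function name `fuzzy_rd_wald` are hypothetical.

```python
from statistics import mean


def fuzzy_rd_wald(records, bandwidth):
    """Ratio of the jump in outcomes to the jump in enrollment at the cutoff.

    records: (days_from_cutoff, enrolled, mother_works) tuples, where a
    negative days_from_cutoff means the child was born before the cutoff
    and is therefore eligible to enroll this year.
    """
    eligible = [r for r in records if -bandwidth <= r[0] < 0]
    ineligible = [r for r in records if 0 <= r[0] <= bandwidth]
    jump_enrolled = mean(r[1] for r in eligible) - mean(r[1] for r in ineligible)
    jump_works = mean(r[2] for r in eligible) - mean(r[2] for r in ineligible)
    return jump_works / jump_enrolled


# Hypothetical records (illustration only). Enrollment is not perfectly
# aligned with eligibility, which is what makes the design "fuzzy".
sample = [
    (-40, 1, 1),  # outside a 30-day bandwidth, so ignored below
    (-20, 1, 1), (-12, 1, 1), (-7, 1, 0), (-2, 0, 0),
    (3, 0, 0), (9, 0, 1), (15, 1, 0), (25, 0, 0),
]
print(fuzzy_rd_wald(sample, bandwidth=30))
```

Re-running the function with different values of `bandwidth` mirrors the sensitivity checks that RD studies commonly report.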
Studies from Argentina (Berlinski, Galiani, and McEwan 2011), Brazil (Ryu 2020), and Vietnam (Dang, Hiraga, and Cuong 2019) use this approach to contrast the employment of mothers of children born on either side of the preschool enrollment cut-off. In Vietnam, all children 0-5 were included in the sample, taking into account the September cut-off for primary school enrollment; differences in age were also explored separately (i.e., age 2 vs. age 3 in September). In Brazil and Argentina, children above and below the compulsory education enrollment cut-off of age 4 were studied. Similar to the balance checks in the RCT approach, Berlinski, Galiani, and McEwan (2011) and Ryu (2020) confirm the smoothness of covariates around the birthdate of interest, but Dang, Hiraga, and Cuong (2019) use only a small set of exogenous covariates.
In Vietnam and Brazil, there was no significant impact of childcare provision on the likelihood of mothers working. Dang, Hiraga, and Cuong (2019) theorize that this is because over 90 percent of mothers are already working in Vietnam. Nevertheless, in Brazil 50% of the mothers were working and in Argentina only 30%-50% of mothers were working, and these two studies had contrasting findings: there was no change in employment in Brazil, while in Argentina access to childcare enabled 13 mothers to start work for every 100 children that start preschool.
The Vietnam and Brazilian studies offer insights relating to employment quality that are not often explored. In Vietnam, childcare provision results in an increase in the probability of wage-earning employment (41%), formal employment (26%), and long-term (2 years) participation in the labor market (38%). All of these shifts result in movement out of farm employment, likely into higher productivity jobs. The probability of being poor is also reduced by 22 percentage points. In Brazil, there was a 46% increase in the probability of being formally employed, though only among women with no other younger children and no relatives in the household, and conditional on participating in childcare. The Argentinian study does not focus on job quality, but it does find that the probability of working more than 20 hours per week 10 increased by 19 percentage points for mothers whose children attended preschool.
The most distinctive finding relating to hours worked among this set, and indeed among the entire set of studies, comes from Brazil: among mothers without additional younger children who took up work, the increase in hours worked corresponded almost exactly to the childcare time (Ryu 2020). Preschool is about four hours per day and the increase in hours worked by these mothers was 23 hours. There was a corresponding decrease in household chores of 18 hours. No other study finds such strong alignment, though in some cases this is because estimates are for the population as a whole rather than for the specific group of mothers who entered the workforce.
All three studies examine various bandwidths to check sensitivity of the estimates and do placebo tests using alternate birth months (e.g., should a one-month difference in age impact maternal employment), providing confidence in their estimates. Again, these studies provide localized estimates (LATE) for mothers of children of a specific age, and they do include the full national population.

Difference-in-Differences
Although many of the previous studies mentioned in this review have used multiple years of data, this temporal aspect has not been a key component of the estimation strategy. With the difference-in-differences approach, changes over time are contrasted. However, in order to avoid other factors influencing the outcomes over time, there is an additional contrast between treated and untreated groups, where treatment is access to childcare. This strategy has been used in many contexts where government has scaled up childcare programs, resulting in variation in coverage over both time and place. We summarize the policies used in DiD studies in Table A1 rather than providing details on each of them in the text. These studies often examine large-scale government changes using national data, resulting in estimates that are more generalizable than those from RCTs or RD studies.
For the difference in changes between the treated and untreated groups to reflect causal estimates, two assumptions need to hold. First, outcomes in the two groups need to follow similar trajectories prior to the increase in availability of childcare. Second, there cannot be time-variant factors simultaneously correlated with access to childcare and the outcome variables, which researchers do not observe and cannot control for.
The first assumption can be tested if at least two data points are available prior to rollout of a childcare program. The second one is not a testable assumption. To increase the likelihood that the second assumption holds, some studies refine control groups by adding another dimension in a triple difference framework, such as age of children eligible for institutional childcare. The additional dimension reduces the likelihood that some unobservable factor that is correlated with access to preschools affects women's labor market outcomes. Because access to preschool in a triple-difference framework is defined through a combination of three factors (for example, time of the reform, its geographical coverage, and age of children), the likelihood of confounding factors varying similarly along these three dimensions is reduced.
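The double and triple difference comparisons can be written out with group means. The sketch below is a minimal illustration using hypothetical employment rates (in percent); the reviewed papers estimate these contrasts by regression with controls and fixed effects, and the function names `did` and `triple_diff` are ours.

```python
def did(treated_pre, treated_post, control_pre, control_post):
    """Change over time in treated areas minus the change in control areas."""
    return (treated_post - treated_pre) - (control_post - control_pre)


def triple_diff(eligible, ineligible):
    """DiD for mothers of age-eligible children minus DiD for mothers of
    children outside the eligible age range in the same areas and periods."""
    return did(**eligible) - did(**ineligible)


# Hypothetical employment rates, in percent (illustration only).
eligible_mothers = dict(treated_pre=40, treated_post=50,
                        control_pre=42, control_post=45)
ineligible_mothers = dict(treated_pre=60, treated_post=63,
                          control_pre=61, control_post=63)

print(did(**eligible_mothers))                            # double difference
print(triple_diff(eligible_mothers, ineligible_mothers))  # triple difference
```

In this illustration, the double difference among mothers of ineligible children captures any area-specific shock that coincided with the rollout, and the triple difference nets it out.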
Among the 9 papers reviewed, only 3 used a triple differences approach, which we consider a stronger strategy for causal identification than a double differences approach. Calderon (2014) exploits the rollout of preschools for low-income families in Mexico; De la Cruz Toledo (2015) relies on changes in Mexican legislation that made preschool education compulsory in 2002; and Halim, Johnson, and Perova (2021) examine impacts of public preschool expansion in Indonesia. However, only Halim, Johnson, and Perova (2021) provide convincing evidence that pre-treatment trends in the outcome variables were similar. Calderon (2014) graphically shows diverging trends between eligible and synthetic control 11 prior to the treatment. De la Cruz Toledo (2015) does not include a parallel trends test. Of the 6 double difference studies, only Padilla-Romo and Cabrera-Hernandez (2018) reference a parallel trends test, but they do not present the results in the paper.
When discussing results, for comparability we group the studies along two dimensions: treatment variable (categorical or continuous) and the type of intervention (availability of a childcare program or expansion of school day).
Two studies that use categorical variables focus on availability of childcare: they contrast maternal labor force outcomes in Chinese communities with and without a preschool. Estimates are 10% and 20% increases in mothers' labor force participation associated with the presence of a childcare center in the community (Du and Dong 2013 and Kilburn and Datar 2002, respectively). The 10-percentage-point difference in impact may be attributable to the fact that the sample in the Kilburn and Datar (2002) study had a higher baseline MLFP rate (91%) than the sample in the Du and Dong (2013) study (82%), which may have been because they used earlier years of the survey, when funding for education was not yet as restricted.
The remaining seven DiD studies examine continuous variables of childcare coverage. To make impacts on mothers' labor force participation comparable across studies, we consider the estimated effect of a change from 0 to 100% coverage of childcare, i.e., all eligible children participate in care services. 12 Two studies focus on expansion of childcare for children under 5: in Mexico and in Chile. Both programs targeted children of low-income families. In Mexico, Calderon (2014) finds that an increase from no childcare to 100% coverage of children aged 1 to 3 would increase the maternal labor force participation rate by 15%. In contrast, child care for children ages 0-4 years in Chile was not associated with any increase in maternal labor supply (Medrano 2009). Notably, Medrano (2009) provides several robustness checks to test the consistency of the results, some of which yield negative estimates of childcare availability on maternal labor market outcomes. We agree with her interpretation of the conjunction of specifications as indicative of no positive impacts.
Three studies evaluate impacts of preschool expansion for older children. Results for children 3-5 years in Argentina (Berlinski and Galiani 2007) and ages 3-4 in Mexico (De la Cruz Toledo 2015) are similar; a 0-100% increase results in a 7-14 percentage point increase in maternal employment in Argentina and a 15 percentage point increase in Mexico. In Mexico, the magnitudes differ by the age of the child: a change in preschool coverage from 0 to 100% would increase employment of mothers of 4-year-olds by 15 percentage points and employment of mothers of 3-year-olds by 8 percentage points (De la Cruz Toledo 2015). In Indonesia, an additional preschool per 1,000 eligible children aged 3 to 5 increases maternal work by 4.8 percentage points (Halim, Johnson, and Perova 2021). Assuming that 1 preschool serves 150 children, providing spots for all eligible children would increase maternal work by 32 percentage points, about twice the magnitude found in the Latin American studies.
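The Indonesian figure follows from a simple linear extrapolation, which can be reproduced as below. The calculation assumes, as the text does, that one preschool serves 150 children and that the per-preschool effect scales linearly up to full coverage; the function name `full_coverage_effect` is ours for illustration.

```python
def full_coverage_effect(effect_per_preschool_pp, children_per_preschool,
                         children_base=1000):
    """Linear extrapolation of a per-preschool effect to full coverage.

    effect_per_preschool_pp: effect of one extra preschool per
        `children_base` eligible children, in percentage points.
    children_per_preschool: assumed capacity of one preschool.
    """
    preschools_needed = children_base / children_per_preschool
    return effect_per_preschool_pp * preschools_needed


# 4.8 percentage points per preschool per 1,000 children, 150 children
# per preschool: about 6.7 preschools are needed per 1,000 children.
print(round(full_coverage_effect(4.8, 150), 1))
```

The result is sensitive to the assumed capacity: a larger assumed number of children per preschool would imply a proportionally smaller full-coverage effect.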
The estimated impacts vary more when considering the extension of the primary school day in Chile and Mexico. For example, Berthelon, Kruger, and Oyarzun (2015) determine that if the share of municipalities with full-day school coverage increases by 45 percentage points (the change required to move all Chilean municipalities to full coverage), mothers' labor force participation would increase by 11.9 percentage points. Expansion of the primary school day length in Mexico, on the other hand, yielded only a 5 percentage point increase in the maternal labor force participation rate (Padilla-Romo and Cabrera-Hernández 2018).
A subset of these studies examines outcomes beyond labor force participation and employment, such as hours of work, income, and the types of work women take up. Most studies that examine hours worked find increases in women's non-domestic work, with estimates ranging from 0.5 to 6 hours per week (Berlinski and Galiani 2007; Calderon 2014; Du and Dong 2013; Padilla-Romo and Cabrera-Hernández 2018). In Indonesia, however, childcare for children ages 3-6 had no detectable impact on hours worked (Halim, Johnson, and Perova 2021). Moreover, a decrease in hours was found among full-time working Mexican mothers of children ages 3-5, suggesting that the public provision of childcare led women to switch from informal care, possibly provided by relatives, to formal childcare. The author interprets the results to indicate that the stricter schedules associated with formal care required mothers to reduce their hours to accommodate these new arrangements (De la Cruz Toledo 2015). Interestingly, Mexican men also reduced the time spent on childrearing and housework when their children were in care (Calderon 2014).
Only three studies examined impacts on income. In Mexico, income increased slightly among mothers of children ages 1-3 years who were not earning prior to their children participating in institutional care (Calderon 2014). Examining the entire household in this context, Calderon concludes that women's employment allowed for some husbands to switch into better jobs while other husbands earned less. No change in income was found in Indonesia (Halim, Johnson, and Perova, 2021). Finally, and most notably, the expansion of the Mexican elementary school day was found to result in a 22% increase in earnings overall with a 36% increase in high poverty areas (Padilla-Romo and Cabrera-Hernández 2018).
Two studies include additional outcomes: type of work and labor force attachment. In Indonesia, mothers primarily take up unpaid family work, which may be because preschools operate daily for less than half a day (Halim, Johnson, and Perova 2021). As mentioned earlier, the study also finds no impact on maternal income or on hours worked. In Chile, greater access to longer school days results in more permanent labor force attachment: with a 45 percentage point expansion in the share of schools participating in the longer school day (the amount that would bring all schools up to full day), both the probability of women participating for at least six months during the year and the fraction of the year they work or seek employment increase by 19 percentage points, the equivalent of two months of work (Berthelon, Kruger, and Oyarzun 2015). These impacts are completely driven by mothers of children in 1st and 2nd grades. These findings are particularly notable because the longer school day only extends the schedule from 1:30 pm to 3:30 pm. Hence, a childcare gap from 3:30 pm to 6 pm remains, and mothers in full-time employment would need to acquire additional childcare to cover it.

Quasi-Experiments Using Waitlists
Two papers use the waitlist feature of childcare programs to assess the impact of receiving childcare, comparing the labor market outcomes of women with children in care to those of women on the waitlist. The waitlist natural experiments create a discontinuity in receiving childcare between those who do receive it and those who are on the waitlist. However, for the waitlist natural experiment to support valid causal conclusions, it relies on the strong assumption that mothers on the waitlist are comparable on all dimensions to those receiving childcare, except for childcare access. To a certain extent, this can be tested by checking for balance (i.e., similarity) of observable characteristics. However, depending on the waitlist rules, some unobserved selection bias could remain if women who signed up earlier (and thus are more likely to receive childcare and less likely to be on the waitlist) are also more likely to seek or acquire employment.
Angeles et al. (2011) evaluated the government of Mexico's Program of Childcare Centers (Programa de Estancias Infantiles, PEI) by contrasting employment outcomes of women whose children got a spot with those of women whose children remained on the waiting list. As the treatment was not randomized, they extensively compared the characteristics of women on the waitlist with those of mothers in the program. They found a few statistically significant differences, but none they considered economically significant, with the exception that a higher share of mothers on the waitlist lived in households with a male head (78%) than mothers with childcare (70%). This could imply that women with less male income in the household had a greater need to work, somewhat affecting the results. However, the magnitudes of the results are larger than this difference between households, suggesting that this bias is not driving the findings.
Results indicate that the childcare program increases employment by 18%. Angeles et al. (2011) also found improvements in work stability and an increase in working hours. Benefits were greatest for women who had not worked prior to the childcare program.
An alternative strategy is to compare mothers to themselves at multiple time periods: when they were on the waitlist and when their children were in care. The individual fixed effect approach reduces selection bias because the mothers' unobservable characteristics are the same in both cases, as long as those characteristics are time invariant. Ranganathan and Pedulla (2021) used this approach to study how mothers of young children who work in an Indian garment factory respond to the availability of factory-provided childcare. 13 Ranganathan and Pedulla (2021) take advantage of one factory's limited capacity to contrast women's work attendance before and after their children were accommodated in the childcare facility. When women receive access to employer-sponsored childcare, their odds of being present at work are 1.71 times higher than when they do not have access to such childcare provisions. However, this identification strategy relies on the assumption that workers cannot anticipate when a childcare slot becomes available. Ranganathan and Pedulla (2021) explain that it is unlikely that mothers would know when vacancies would arise due to the large size of the factory and because they cannot observe the satisfaction of other workers with the program.
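The logic of the individual fixed effect approach can be illustrated with simulated data. The sketch below is not the specification used by Ranganathan and Pedulla (2021), who report odds ratios; it assumes a simple linear model, and all variable names and parameter values are hypothetical. It shows that when a time-invariant unobserved trait drives both childcare access and the outcome, a pooled comparison is biased, while demeaning within each mother removes the trait.

```python
import numpy as np

rng = np.random.default_rng(0)
n_mothers, n_periods = 500, 6
true_effect = 0.10  # hypothetical effect of childcare access on an attendance index

# Time-invariant unobserved trait (e.g., attachment to work), one value per mother.
alpha = rng.normal(0.0, 1.0, size=(n_mothers, 1))

# Selection problem: mothers with higher alpha are more likely to have access.
p_access = 1.0 / (1.0 + np.exp(-alpha))
access = (rng.random((n_mothers, n_periods)) < p_access).astype(float)

# Simulated outcome depends on access, the unobserved trait, and noise.
noise = rng.normal(0.0, 0.2, size=(n_mothers, n_periods))
attendance = true_effect * access + 0.5 * alpha + noise


def slope(x, y):
    """Slope from a bivariate least-squares regression of y on x."""
    x = x.ravel() - x.mean()
    y = y.ravel() - y.mean()
    return float(x @ y / (x @ x))


# Pooled comparison ignores alpha and is therefore biased upward here.
pooled_estimate = slope(access, attendance)

# Within (fixed effect) transformation: subtract each mother's own mean,
# which removes anything that is constant over time for that mother.
access_within = access - access.mean(axis=1, keepdims=True)
attendance_within = attendance - attendance.mean(axis=1, keepdims=True)
within_estimate = slope(access_within, attendance_within)

print(f"pooled: {pooled_estimate:.2f}, within: {within_estimate:.2f}")
```

The within estimate recovers the simulated effect only because the confounding trait is constant over time, which mirrors the time-invariance condition stated above; it would not correct for mothers anticipating when a slot opens.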
Although the outcome in Ranganathan and Pedulla's (2021) Indian study is not directly comparable, since they measure days missed at work rather than a binary employment outcome, the qualitative conclusion remains that childcare provision supports more regular work. Similar to the Brazilian and Chilean RCT studies (Barros et al. 2011; Martinez A. and Perticara 2017), the estimates of these quasi-experimental studies apply only to women who have actively sought out childcare services.

Instrumental Variables
The final set of papers uses the instrumental variable (IV) approach. This type of analysis addresses endogeneity between the variable of interest (access to childcare) and the outcome variable (mothers' labor market outcome) by using a valid instrument to isolate the effect of the explanatory variable on the outcome variable. 14 This means that a valid instrument must, first and foremost, be correlated with the explanatory variable of interest (relevance assumption). Moreover, a valid instrument also needs to affect the outcome of interest only through the explanatory variable channel (exogeneity assumption). In other words, a valid instrument is uncorrelated with any other potential determinants of the outcome variable. However, finding a valid instrument is often a difficult task. While the first assumption can be easily tested, 15 the second assumption is, by design, not testable because many potential determinants of labor market outcomes, such as ability and preference for employment, are not observable.
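The two assumptions can be made concrete with a small simulation of two-stage least squares. This is a minimal sketch under stated assumptions, not the estimation used in any of the reviewed studies: the data are simulated, the variables are continuous for simplicity, and all names and coefficients are hypothetical. The instrument is constructed to be relevant (it shifts childcare use) and exogenous (it does not enter the outcome equation directly).

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20_000
true_effect = 0.25  # hypothetical effect of childcare use on mothers' work

instrument = rng.normal(size=n)   # e.g., a distance-type measure (simulated)
confounder = rng.normal(size=n)   # unobserved, e.g., preference for employment

# Relevance: the instrument shifts childcare use.
childcare = 0.8 * instrument + confounder + rng.normal(size=n)
# Exogeneity: the instrument does not appear in the outcome equation.
work = true_effect * childcare + 2.0 * confounder + rng.normal(size=n)


def ols(X, y):
    """Least-squares coefficients of y on the columns of X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


const = np.ones(n)

# Naive regression is biased because the confounder drives both variables.
naive_estimate = ols(np.column_stack([const, childcare]), work)[1]

# First stage: predict childcare use from the instrument.
Z = np.column_stack([const, instrument])
childcare_fitted = Z @ ols(Z, childcare)

# Second stage: regress the outcome on the predicted childcare use.
iv_estimate = ols(np.column_stack([const, childcare_fitted]), work)[1]

print(f"naive: {naive_estimate:.2f}, IV: {iv_estimate:.2f}")
```

Running the two stages by hand gives the correct point estimate but not correct standard errors, so applied work would rely on a dedicated IV routine. If the instrument were also allowed to enter the outcome equation directly, the IV estimate would be biased, and nothing in the observed data would reveal it.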
Two of the IV studies use distance to childcare centers. Attanasio and Vera-Hernandez (2004) evaluate the national program Hogares Comunitarios (Community Homes) in rural Colombia, where parent groups select a 'community mother' and pay her a small monthly fee to care for their children ages 0-6 years; the government provides food for lunch and snacks. The authors show that distance to childcare centers is indeed strongly correlated with children attending community homes, satisfying the relevance assumption.
However, the exogeneity assumption is violated if the government targets locations where parents can benefit the most from community homes, or if households move closer to community homes. The authors argue that because it is often difficult to meet the minimum number of children to form a new pod, the location of community homes often stays the same over the years. They further show that households who move do not move to be closer to childcare centers. Unfortunately, given the paper's main focus on children's outcomes, they do not discuss other potential determinants specific to maternal employment. Notably, mothers who live closer to town may have better access to employment opportunities. Assuming that this and other unobserved factors are satisfactorily addressed by the included control variables, the authors find that, with childcare attendance, the probability of maternal employment increases from 0.12 to 0.37. In addition, the program increases the number of hours worked by 75 hours per month (Attanasio and Vera-Hernandez 2004). At almost 20 hours per week, this increase is quite notable, but it is still less than the time the child is in care, since the Hogares Comunitarios cover more than a half day.
In China, Du, Dong and Zhang (2019) try community-level distance to the nearest daycare and median local prices of daycare as instruments; only the former is shown to be relevant. 16 The authors do not explain why distance to daycare and median local prices are plausibly exogenous in the context of urban China, and instead rely on the inclusion of community and province fixed effects to account for any other potential determinants that are time-invariant and common within region. However, if fewer daycare centers are operational over time due to decreased demand or if median daycare prices covary with wages over time (both of which likely also influence mothers' decision to work), then the exogeneity assumption would not hold. Nevertheless, the authors find that access to a childcare center increases MLFP by 24-29 percent.
A recent study of Nicaragua's publicly provided daycare Programa Urbano leverages a combination of imperfect random assignment of community daycare centers (Centros Infantiles Comunitarios) and distance to daycare centers (Hojman and López Bóo 2019). Programa Urbano targets extremely poor families in urban neighborhoods and serves children ages 0 to 4 for half a day, five days a week. The randomization was intended to be phased in, with control communities receiving daycare centers a few years later. Unfortunately, compliance was imperfect: (i) some control communities received community daycare centers, (ii) some children in untreated areas attended daycare, and (iii) some children in treated areas did not attend daycare. To address this, the authors use the random assignment as an instrument for daycare enrollment; they complement the analysis with distance to daycare centers as an additional IV. While the second IV is more contestable, they argue that the daycare location was not chosen in any special way and that the results with and without the second IV are comparable. Hojman and López Bóo (2019) find that daycare centers lead to a 14 percentage point increase in mothers' work participation.
A fourth paper studies childcare centers in Sao Paulo using the share of people on the waitlist as an IV for childcare access (Sanfelice 2019). The city of Sao Paulo operates a waitlist system that depends on a combination of age, location, and enrollment date. For younger ages, fewer children are assigned per instructor, which reduces the number of available slots. Children who live nearby and who enrolled earlier are prioritized when a slot becomes available. Sanfelice (2019) exploits this exogenous variation in the competitiveness of obtaining a childcare slot by comparing observationally equivalent mothers of children ages 0 to 4 living in the same neighborhood who differ only in their chances of obtaining a public daycare slot because their children differ in age and therefore face different waitlist lengths. The exogeneity of age is likely to hold unless mothers can falsely report the age of their child. Sanfelice finds that the use of center-based care increases the probability of maternal employment by 44 percentage points among compliers; mothers are more likely to work full time and more likely to work in the formal sector.
In Ecuador, Rosero and Oosterbeek (2011) contrast a home visiting program with childcare centers. Ecuador's Funds for Child Development is a national program serving young children that offers two types of child development support: home visits and childcare. Communities applied for funding for the different programs and were scored on a variety of criteria, such as socioeconomic characteristics of the neighborhood, coherence of the proposal, quality of personnel, and others. As in a discontinuity design, the authors instrument program receipt with a variable indicating whether the community scored above or below the scoring cut-off. However, unlike a discontinuity design, they do not restrict the sample to those within a certain distance from the cut-off. The proposal score would not be a valid instrument, for example, if communities writing a winning proposal are more educated and have better access to employment opportunities for mothers.
Conditional on the instrument being valid, the authors find that home visits positively impact children's cognitive outcomes but negatively impact the likelihood of mothers working, while childcare centers increase maternal employment by 22 percentage points but do not benefit children's cognitive outcomes. Both interventions affect hours worked per week: childcare centers trigger an increase of 7 hours per week, but home visits result in a reduction of 4 hours per week.
Mothers' income is not impacted by either policy, but household income increases by about $60 per month. This finding may result because the entire family works in a small household enterprise so income is not attributed to any specific individual though it may be the mothers' work that increases the household income. Finally, it is important to note that the study from Ecuador is one of the few papers that examines health impacts on the mother and finds an increase in stress associated with childcare centers and a decrease in stress associated with home visiting (Rosero and Oosterbeek 2011).
While the IV approach seems to offer more flexibility in dealing with data limitations, finding a credible and convincing IV is challenging. In most cases, results hold conditional on rather strong assumptions.

Summary of Findings
Examination of studies within and across several methodologies suggests a number of broad conclusions. First, overall, childcare provision positively affects maternal labor market outcomes. All studies with the exception of Medrano's 2009 paper on Chile find positive impacts on maternal labor force participation and/or employment. However, the magnitudes of impacts vary widely across contexts. The next section will discuss delineating design features and circumstances that may have contributed to the size of the impacts.
The impacts appear to be driven by sub-populations. This is evident from the comparison of ITT and ToT estimates in RCTs, heterogeneity analysis, and the relatively low magnitude of impacts in the studies that exploit national expansion of preschool programs. While the absence of childcare may be a key barrier to joining the labor market for some women, it may have no bearing on the labor market choices of others. Alternatively, some women may face multiple constraints; for example, they may have children of different ages who need care, and a specific childcare program would only address the care deficit for one of them.
It is also important to note heterogeneity in program implementation: while all studies reviewed provide some childcare, they vary in terms of the number of hours, age eligibility, and pairing with other childcare programs (e.g., self-standing preschools vs. extension of the school day). The relatively low number of studies and the relatively wide range of implementation arrangements preclude a rigorous analysis of relationships between specific features of childcare provision and the magnitude of impacts. However, in the next section, we present some observations regarding associations between context, specific design features, and program impacts.

CONTEXT, DESIGN FEATURES, AND IMPACT
Several reviews have generally concluded that investing in quality childcare promotes early child development and is a cost-effective investment in human capital in the long run for low-income settings (Duncan and Magnuson 2013; Patrinos and Psacharopoulos 2020). Our review contributes by documenting additional benefits, highlighting the general findings on the effects of institutional childcare on maternal labor market outcomes, which overall confirm a positive, causal relationship between childcare and maternal labor market engagement. Several themes emerged in the papers we reviewed that suggest childcare design choices or contextual factors may strengthen impacts on mothers' labor market outcomes in addition to the impacts on children.

Designed with or without the Mother in Mind
As Blau and Currie (2006) note, expansion of childcare may pursue a dual objective: improvements in early childhood development as well as maternal labor market engagement. Among the programs included in our review, some were explicitly focused on improving maternal participation in the labor market (such as Estancias Infantiles in Mexico and India's law requiring factories to provide childcare) while others predominantly pursued the objective of improving early childhood development and equalizing learning opportunities (such as the expansion of childcare for low-income families in Chile or the construction of preschools in Indonesia). In the extreme, Ecuador's home visiting program supported child development at the cost of mothers' working hours. We note that the impact on maternal labor market outcomes may depend on whether mothers were taken into consideration at the design stage. The studies we reviewed point to three aspects of design that appear to affect maternal labor market engagement: hours of operation, distance, and intra-program coordination.
Childcare services that exclusively pursue the objective of early childhood development typically operate for less than a full workday. This includes the only program in our review that did not have an impact on MLFP: the childcare expansion in Chile, which provided care between 8:30 am and 3:30 pm, shorter than a 9 am to 6 pm Chilean workday (Medrano 2009). In Indonesia, preschools are available for 3-5 hours per day, and, while they increase maternal labor force participation, women join the labor market as unpaid family workers (Halim, Johnson, and Perova 2021). In Brazil, compulsory preschool for 4-year-old children offered for four hours a day did not have an impact on labor force participation (Ryu 2020). However, it increased hours of work for already working mothers by 23 hours (similar to the amount of time spent by children in school) and facilitated switching to formal jobs. This could be because working mothers already had some informal care, which was supplemented by the formal care arrangement.
Notably, although only a subset of studies analyzes impacts on income or earnings, the ones that find positive impacts evaluate programs that provide full-time childcare in Brazil, Vietnam, and Mexico (Barros et al. 2011; Calderon 2014; Dang et al. 2019). Two of these programs, in Brazil and Mexico, explicitly pursued an objective of women's economic empowerment. Hours of operation extending beyond a full workday also appear to matter; in Mexico, in addition to increasing employment, preschool expansion also reduced the number of hours worked for mothers who had been working before the expansion. The author hypothesized that this decrease in hours worked was due to the need to pick up children from preschools at specific hours (De la Cruz Toledo 2015). Clark et al. (2019) note a similar reduction in work hours among women who were employed before the program in Kenya. 17
Distance (either from home to potential employers or from home to childcare) also appears to be a factor affecting the impacts. In Vietnam, childcare provision was associated with an increase in wage employment; however, this increase was smaller when the family was located farther from town (Dang, Hiraga, and Cuong 2019). The authors hypothesize that this is driven by the lack of economic opportunities in more remote areas. In Colombia, children who live closer to childcare centers are more likely to be enrolled, and their mothers are more likely to work (Attanasio and Vera-Hernandez 2004). Du, Dong, and Zhang (2019) find that distance to childcare is significantly correlated with its use for younger children (0-2), although it does not appear to matter for children aged 3 or older.
Several studies also point to the importance of coordination of different types of childcare services. First, Berlinski, Galiani, and McEwan (2011) show that the impacts of preschool availability for children age 5 on maternal labor force participation and work hours in Argentina are driven by mothers who do not have younger children. Similarly, Ryu (2020) also finds impacts on women without younger children. Martinez and Perticara (2017) demonstrate connections between different types of childcare services in Chile; extension of school hours for children from 6 to 13 also increased enrollment of children under 5 years into preschools. Moreover, the impacts of the program were the highest for women who did not work and had children under 5. This result suggests that childcare services may be ineffective if they alleviate constraints imposed by care demands for one of women's children without addressing similar constraints associated with their other children. In Chile, preschool services were available prior to expansion of school hours, and apparently some women with older children did not take advantage of those services because they would not be able to work anyway.
Overall, the combined findings in this review suggest that explicit focus on maternal labor market engagement in design of childcare (such as adjusting hours of operation or coordinating different types of childcare services for different ages) has the potential to increase its impact on maternal labor market outcomes.

Child's Age
Our review suggests that the potential of childcare provision to increase maternal labor force participation is likely to vary depending on the ages of the children it targets. Our review includes studies of childcare provision to children aged 0 to 13. Different methodologies and differences in age bands of children included in the studies preclude us from explicitly comparing the effectiveness of childcare provision on maternal labor market outcomes at different ages. However, several broad themes emerge across studies suggesting that different policy tools may be needed for young children.
A few studies compare impacts of childcare for different age groups. In Mexico, maternal employment was more responsive among mothers of 4-year-olds than among mothers of 3-year-olds (De la Cruz Toledo 2015). As discussed in the previous section, in addition to an increase in employment, the Mexican study also found a reduction in hours worked, likely to accommodate childcare center hours. Mothers of 3-year-olds reduced hours more than mothers of 4-year-olds, suggesting that mothers of younger children either preferred to pick up children themselves or were leaving them in day care for shorter hours. In Vietnam, Dang, Hiraga, and Cuong (2019), examining impacts of childcare for children ages 1 to 5, find higher impacts for older children, although the differences are not statistically significant. Notably, these results are consistent with evidence from developed countries, where mothers are also more willing to place their older children in institutional childcare. For example, Denmark has the highest enrollment rate of young children ages 0-2 among OECD countries, but this is only a little more than 60 percent (Del Boca 2015). Rates of enrollment for Danish children 3-5 years are above 90 percent. In contrast, however, Hojman and López Bóo (2019) find that impacts of childcare on work participation are similar for mothers of toddlers and mothers of 3-4-year-olds in Nicaragua.
Three policy-relevant considerations may explain the relatively higher effectiveness of childcare for older children. First, Ranganathan and Pedulla (2021) suggest that in India, there may be a strong social norm for maternal care for younger children. Thus, relatives may refuse to provide impromptu care for an infant while the mother works (such as when a regular caregiver falls ill), while they would be willing to do it for an older child. This lack of back-up options may drive women out of the labor market completely. Second, the challenge of evaluating childcare quality for very young children may also contribute to difficulties in effective implementation: children ages 0-2, who have not yet mastered the ability to speak, cannot report on the events at the childcare center, requiring more trust on the part of the parents. Finally, infant care is generally more intensive than preschool care, so high-quality provision is more costly, which may make it less accessible. For example, in Sao Paulo, the public administration established strict child-to-instructor ratios of 7 for children younger than 2, 9 for 2-year-olds, and 12 for 3-year-olds; children age 4 and older can be in classrooms of 25 students (Sanfelice 2019). Similarly, in Nicaragua, one educator is assigned per 8 children younger than 3, while one educator is assigned per 18 children ages 3-4 (Hojman and López Bóo 2019).
Notably, there appears to be another inflection point for the effectiveness of childcare for maternal labor force participation: Berthelon, Kruger, and Oyarzun (2015) find that expansion of the school day in Chile affected labor market outcomes of mothers of 1st and 2nd graders only. Not surprisingly, as children grow, childcare becomes less of a constraint for labor market engagement of their mothers.
Overall, we conclude that childcare's impact on MLFP is strongest for the early childhood (3-8) age range; the association between childcare provision and maternal labor force outcomes is weaker for mothers of infants ages 0-2 and of relatively older children in middle childhood. It is important to note, though, that excluding younger children would leave the motherhood penalty unmitigated, meaning mothers would not gain human capital while their children are very young and would be at risk of acquiring lower quality jobs when they return to the labor force. However, more research is needed on whether childcare attendance at earlier ages introduces a trade-off between mothers' work participation and children's long-term development outcomes. Some research has found institutional childcare not to be as beneficial for children ages 0-2 (Devercelli and Beaton-Day 2020). Evidence from France, in turn, suggests that starting childcare at age 2 has positive impacts on mothers' employment with no long-term adverse effects on children's education (Goux and Maurin 2010). Generalizing this finding to developing-country contexts, where quality of childcare may be lower and less enforceable, requires caution, however.

Targeting
Studies from upper-income countries report that low-skilled women (usually poorer) are often the most responsive to changes in the cost and availability of childcare (Morrissey 2017); high-skilled women (wealthier) command higher wages, such that their income exceeds the cost of care by a sufficient margin that they have incentives to work even when they have small children (Blau and Currie 2006). If this pattern holds in lower- and middle-income countries, it would provide additional justification for targeting subsidized childcare provision to low-income mothers.
Some consistency is found along this dimension in middle-income countries. Berthelon, Kruger, and Oyarzun (2015) find that lower-educated mothers' labor force participation increases twice as much as that of higher-educated mothers in response to increases in the length of the Chilean primary school day. The effects of extending the school day in Mexico are also concentrated among less-educated mothers, defined relative to the sample median of incomplete secondary education; no impact on labor market participation was found among higher-educated mothers (Padilla-Romo and Cabrera-Hernández 2018). In Brazil, the overall impact on women's employment is expected to be higher if low-income families receive priority, compared to allocation on a "first come, first served" basis, according to simulations carried out by Sanfelice (2019).
However, several studies find the opposite pattern. In Vietnam, more educated mothers are more likely than less educated mothers to acquire wage work when their children are in preschool; additionally, the effect of childcare centers on female labor force participation is lower for ethnic minority women in Vietnam, perhaps because ethnic minorities are less likely to have similar job opportunities (Dang, Hiraga, and Cuong 2019). For programs that target low-income populations, like Mexico's Estancias Infantiles, there are no differences in employment impacts for women with and without high school education; however, more educated women experience larger impacts on the intensive margin; that is, women with high school education earned more (Calderon 2014).
A wide range of studies suggests that disadvantaged and vulnerable children benefit more from childcare provision than better-off children (Devercelli and Beaton-Day 2020). Thus, targeting childcare provision to lower-income households is justified for maximizing child development benefits. However, distinct from evidence from upper-income countries, the evidence from lower- and middle-income countries in this review suggests that impacts on maternal labor outcomes from childcare provision do not always concentrate among the poor. This may be due to the lack of labor market opportunities for low-income or low-education women. Consequently, while targeting low-income populations with institutional childcare is likely to be an optimal policy overall, considering heterogeneous impacts on children and mothers suggests that complementary measures may be necessary in some contexts. Without supporting mechanisms (for example, skills training to expand labor market options), lower-income women may not be able to fully reap the benefits of childcare.

Compulsory or Voluntary
One of the strongest policy stances would be to extend the years of public education, making preschool compulsory. Several studies from Latin America report that governments were initially interested in compulsory preschool, but ultimately relaxed their plans. For example, in Argentina, a law was passed in 1993 to make one year of pre-primary education compulsory for 5-year-olds, but there has been no penalty for families that do not comply: their children are still admitted into primary school (Berlinski and Galiani 2007). Brazil also had the goal of making preschool compulsory by 2016, but this was not achieved (Ryu 2020; Sanfelice 2019). Mexico had more success: one year of preschool was made compulsory, though the policy had initially planned for three years (De la Cruz Toledo 2015).
On the supply side, capacity is one concern with making a policy compulsory. Even policies to extend the primary school day for existing schools have faced implementation challenges, and expansion occurred over several years (Berthelon, Kruger, and Oyarzun 2015; Padilla-Romo and Cabrera-Hernández 2018). In Chile, expansion of the number of schools offering the longer school day was fastest in rural areas, where space constraints and costs of expansion were lower. Thirteen years after launching the extension of primary school from part-time to full-time, coverage was only 66 percent of total primary school enrollment (Berthelon, Kruger, and Oyarzun 2015). Similarly, in Brazil childcare is a constitutional right, but the capacity to serve all children whose mothers desire care has not yet been reached (Sanfelice 2019); preschool enrollment was still under 40 percent for eligible children (Ryu 2020).
The length of the preschool day matters if preschool is made compulsory. If the preschool is only part-day and is compulsory, this could have a negative impact on mothers' labor supply if they substitute away from informal care. Mothers may then work fewer hours in order to coordinate work with the preschool schedule (De la Cruz Toledo 2015). In contrast, in Brazil, Ryu (2020) found that women who were already working increased their hours when preschool was mandated (though not enforced). As with public primary school, compulsory preschool would also need to be provided free if policy makers are to accommodate the financial constraints of the population.
However, when compulsory enrollment can be enforced, evidence from Norway's lowering of its school starting age from 7 to 6 may be instructive (Finseraas, Hardoy, and Schøne 2017). Access to subsidized publicly provided childcare was limited and not free. Meanwhile, compulsory primary schools were free of charge and came with after-school care on school premises. This guaranteed parents of primary school-aged children access to full-time childcare. Exploiting this unexpected windfall of free childcare as a natural experiment, the authors find that the policy led to an increase in maternal labor supply and earnings, especially among mothers with low wage potential. Ranganathan and Pedulla (2021) suggest that making preschool compulsory may help women overcome pressures associated with social norms that favor maternal over institutional care and contribute to ensuring take-up. However, the magnitude of the estimates in the papers reviewed here does not differ dramatically for compulsory vs. voluntary childcare. Making preschools compulsory is unlikely to have dramatically different impacts on maternal labor supply than simply increasing their availability to households.

DIRECTIONS FOR FUTURE RESEARCH
While the plethora of research from higher-income countries may offer some guidance on the existing gaps in knowledge, more research highlighting contextual challenges in low- and middle-income countries is still needed. Market failures that underpin the case for government interventions in childcare, such as household liquidity constraints and information failures, are likely more pronounced in developing than in higher-income countries. Liquidity constraints facing many households in emerging economies likely lead to an undersupply of privately provided childcare. Devercelli and Beaton-Day (2020) estimate that 350 million children worldwide lack access to quality childcare, and 80 percent of those live in low- and middle-income countries. Meanwhile, quality standards are less enforceable in developing countries, rendering information on childcare quality, when it is available, noisier and less trustworthy.
Our review suggests that while there is a solid consensus in the literature on the positive impact of childcare on maternal labor force participation, we know much less about the impact of childcare on other labor market outcomes (such as hours of work, types of job, formal or informal sector, wages and income, or self-employment). The relationship between childcare and type of work may be particularly important. For example, correlational evidence from Uganda finds that female-owned businesses where children are present earn 48 percent less profit than similar businesses without children in the space (Delecourt and Fitzpatrick 2020). Analyzing the relationship between childcare and these other labor market outcomes in the future, with specific focus on implementation features of childcare services (such as hours, public or private provision, and whether attendance is compulsory), would enrich the literature and provide relevant insights for policy makers. There are several other underexplored areas of inquiry, which we list below.
First, future research should examine the impacts of childcare on women's well-being because mothers' entry into the labor market, if not accompanied by changes in attitudes or in the distribution of household and family care responsibilities, might mean that women end up shouldering increased responsibilities between the workplace and work at home. This may lead to higher levels of stress. This may be the case in Ecuador, where Rosero and Oosterbeek (2011) find an increase in mothers' stress and depression associated with childcare provision. However, only two other studies reviewed include outcomes that capture non-monetary well-being. Martinez and Petricara (2017) find no impact on mothers' stress levels, while Angeles et al. (2011) find no impact on mothers' psychological measures. One explanation for the null findings may be that, in many settings within developing countries, the stress of poverty outweighs the stress of work. This would be consistent with evidence from Brodeur and Connolly (2013) in Quebec, who find that subjective well-being increased for lower-educated mothers and fathers and decreased for higher-educated parents as a result of access to childcare.
Another relatively under-researched area is the role of other household members in provision of informal childcare. Some papers in this review suggested informal care provided within the home was a reason that mothers' labor response to the provision of institutional childcare was limited. Some mothers did not need to rely on institutional care because other household members already provided care to their children. In the worst-case scenario, these may be siblings who miss out on school or other opportunities due to care responsibilities (Jakiela et al. 2020). More commonly, grandparents provide informal care to young children, allowing mothers to work. Husain and Dutta (2015) find maternal grandparents' support is important to mothers' labor market decisions. Du, Dong, and Zhang (2019) find that access to grandparental care (as instrumented by proximity to healthy grandparents) increases MLFP by around 40 percent while access to daycare services increases MLFP by less than 30 percent.
However, many grandmother-caregivers are still of ages in which they could be participating in the labor force. For example, research from Chile finds that the average age of grandmothers coresiding with grandchildren under the age of 5 is 53 years (Reynolds et al. 2018). Although the older generations typically have lower labor force participation rates than younger generations, as current mothers age and become grandmothers themselves, they (and their daughters) may face difficult decisions about who leaves the labor force to care for children. These concerns indicate the need to include labor outcomes of a broader variety of actors in the literature on childcare. For instance, the impacts of institutional childcare on female labor supply may be understated if institutional childcare allows informal caregivers to participate in the labor force.
In addition to questions regarding institutional childcare's impact on other females in a household, important questions remain regarding the impact of childcare provision on male labor market outcomes, with one notable exception. Calderon (2014) finds that men work less when childcare is provided in Mexico because mothers now contribute to household income. Some of these men opt for an unemployment spell in order to switch to better-paying jobs, but others choose more leisure.
Lastly, intra-household bargaining is at the crux of the childcare and labor market decision. Several papers suggest its importance, while at the same time acknowledging important knowledge gaps in this area. As more mothers enter the labor force, intra-household gender dynamics should be studied, since having independent income may improve mothers' bargaining position in the household, although it may also put women at risk of violence as a result of male backlash to inverted gender roles (Buller et al. 2018). In Kenya, it was somewhat surprising that childcare in urban slums did not affect measures of women's autonomy with respect to decision making with their husbands despite higher levels of employment (Clark et al. 2019). In Ecuador, access to childcare increased household, but not maternal, income (Rosero and Oosterbeek 2011). These disparate findings suggest that to fully understand how to maximize the potential of childcare services to improve child, maternal, and household outcomes, researchers and policy makers need a better understanding of how intra-household bargaining affects childcare and domestic and nondomestic labor.

CONCLUSION
Research from lower- and middle-income countries overwhelmingly establishes a positive causal link between childcare and maternal labor market outcomes. Of the 22 studies we reviewed, all of which have strong empirical analyses, 21 find statistically significant increases in maternal work and/or other labor market outcomes resulting from an increase in access to care, an increase in care hours, or a reduction in the cost of care. Only one study, of a childcare program targeting low-income families in Chile, does not find consistently significant impacts (Medrano 2009). The results of this review offer encouraging evidence that childcare can help raise women's labor market engagement in the developing world. However, while current research overwhelmingly points towards positive impacts of institutional childcare on maternal labor force participation and work, evidence on the intensive margin and on final outcomes associated with participation in the labor market, such as maternal and household incomes, is less conclusive. First, these outcomes are not consistently explored across studies, possibly due to data limitations. Second, while some studies find positive impacts on earned income (Barros et al. 2011; Clark et al. 2019; Padilla-Romo and Cabrera-Hernandez 2019) and evidence of switching to more productive jobs (Dang, Hiraga, and Cuong 2019), others suggest that increased labor market engagement is driven by low-productivity work, such as unpaid family work (Halim, Johnson, and Perova 2019), and find no evidence of significant increases in maternal income (Martinez and Petricara 2017).
Available evidence suggests that the strength of impacts of childcare on maternal labor market outcomes beyond labor force participation may depend on specific features of childcare service design and implementation. Specifically, our review suggests that designing childcare services only with the objective of early childhood development in mind may not realize the full potential of institutional childcare in improving maternal labor market outcomes. The hours of operation of childcare facilities appear to matter, for example. The collection of reviewed papers also highlights challenges with provision of childcare for younger children (under the age of 3). For example, improving maternal labor outcomes through provision of childcare services for very young children would most likely need to tackle additional problems of social norms, trust, and higher costs. Lastly, our review emphasizes the importance of carefully considering local context when targeting childcare services to specific income groups. In contrast to high-income country settings, poorer and/or less-educated mothers in low- and middle-income countries do not always reap higher benefits from public provision of childcare, a result that may be driven by a paucity of labor market options for these women.
Future research can prioritize examination of the relationship between these and other features of childcare services and labor market outcomes beyond labor force participation and work. Future research could also focus on non-monetary welfare outcomes of mothers, and on the impacts of different childcare arrangements on other household members, including grandparents, fathers and older siblings.
The COVID-19 pandemic brings questions about childcare and FLFP to the fore, as school closures and economic contractions have often resulted in women reducing their employment and/or shifting into informal work (Biscaye, Egger, and Pape 2021; O'Donnell et al. 2021). As a result, some policy makers have developed responses directly pertaining to childcare provision, as in the case of the United States, for example (White 2021). We hope that this review, which summarizes available evidence and highlights design features of institutional childcare that can facilitate maternal labor market engagement, may be useful as policy makers tackle the COVID-19-induced domestic care burden, as well as the broader challenges associated with mothers' engagement in the labor market.

Argentina
Berlinski & Galiani 2007: In 1993, the Federal Ministry of Education started a large infrastructure program aimed at expanding school attendance for children aged 3-5. Between 1994 and 2000 the school construction program allowed for an 18 percent increase in pre-primary school enrollment.

Chile
Medrano 2009: The Presidential Advisory Council for Infant Policies, Michelle Bachelet, prioritized preschool for low-income children, with the objective of improving early childhood development for disadvantaged populations.

China
Women's labor force participation was one of the highest among developing countries under Mao. However, in 1978, China began a transition from a planned to a market economy. Emphasis in childcare also focused on education, reducing focus on women working. Public care for children ages 0-2 years was canceled. This resulted in an overall reduction in childcare provision in urban China, in spite of an increase in private care.

Mexico
De la Cruz Toledo 2015: Mexico passed a law in 2002 to expand three years of preschool education with the goal of all children ages 3-5 attending compulsory preschool by 2008. However, this was relaxed to children having to attend only 1 year of preschool prior to enrolling in elementary school. As a result of these policies, preschool enrollment increased by 23 to 47 percent for three-year-olds and by 69 to 100 percent for four-year-olds between 2004 and 2012.

Mexico
Calderon 2014: In 2007, Mexico launched Programa Estancias Infantiles (PEI, Child Care Centers Program), which provided access to childcare for children aged 1 to 3 years old to women or single fathers who were working, looking for a job, or studying. Notably, facilitating entry into the labor market was an explicit goal of this program.

In Mexico, Estancias Infantiles (daycares) which provide childcare services for children aged 1 to 3

Indonesia
Halim, Johnson & Perova 2019: Following the tumultuous financial crisis of 1997/98, the government recognized the importance of education, including early childhood education, to build back the economy and sustain the high rates of economic growth experienced prior to the crisis. The National System of Education Act of 2003 adopted early childhood education as part of the national education system. Subsequent legislation in 2005 required 20 percent of the national and regional governments' budget to be allocated to education expenses, ensuring that early childhood educational development expansion is financed.

Chile
Berthelon, Kruger & Oyarzun 2015: In the second half of the 1990s, Chile initiated a large-scale education reform that mandated that all primary and secondary schools that receive public funds (municipal or private subsidized) must offer a full-day program by 2007 and 2010, respectively. The average time spent at school increased by about 35 percent in primary schools without increasing the number of days in the academic calendar. Expansion was more rapid for rural than urban schools since urban schools were space constrained and often used the same infrastructure for two sets of half-day students.

Mexico
Padilla-Romo & Cabrera-Hernandez 2018: Primary school day expansion that extended the school day from four and a half to eight hours for children ages 7-11. The expansion occurred from 2007 to 2016, reaching 25 percent of all primary schools.

Berlinski and Galiani (2007) 0.142* New stock: number of preschools, multiplied by 50, normalized by cohort size.