Labor and Welfare Impacts of a Large-Scale Livelihoods Program: Quasi-Experimental Evidence from India

Improving the livelihoods of poor households and transitioning more women back to the labor force is a major challenge in South Asia. Self-employment promoted through women's groups has often been cited as a promising intervention towards this end. However, the evidence on the impact of such programs on household income and labor outcomes is limited, especially for government programs like the National Rural Livelihoods Mission in India. This study aims to provide empirical evidence on the welfare impacts of an "intensive approach" adopted under this program. The data for the study come from 4,316 household surveys in 727 villages. The study uses matching methods with the population and socioeconomic census, as well as an instrumental variable approach to construct a retrospective control group. The analysis finds that the program has been able to achieve its primary objective of improving livelihoods by transitioning more women into work. The program has also expanded access to credit, increased the proportion of savings, and reduced interest rates on credit for rural households. This is the first study to estimate the annual income effects of a government-run rural livelihoods program in India, and it shows significant increases in median income across the sample. The results for 30th, 40th, and 75th percentiles are also large and significant. However, the study did not find significant average treatment effects for income. Contrary to previous studies, this study finds weaker impacts on assets, except for livestock.


Policy Research Working Paper 8883
This paper is a product of the Agriculture Global Practice. It is part of a larger effort by the World Bank to provide open access to its research and make a contribution to development policy discussions around the world. Policy Research Working Papers are also posted on the Web at http://www.worldbank.org/prwp. The authors may be contacted at agupta20@worldbank.org.

1 Introduction
Across the globe, women face unequal economic opportunities and constrained choices in their income-generating activities; the situation is much worse in South Asia, especially India (Chaudhary and Verick 2014). Even the challenges of poverty are borne by women inequitably (Sen 2001). In rural India, women from poor households face the additional challenges of lower education levels and less ability to contribute to intra-household decision making. Better economic opportunities and enhanced employment are often cited as key pathways for reducing gender inequality (Gradín, del Rio, and Cantó 2010). Rural women find it very difficult to obtain salaried jobs, primarily because of low education levels (skills and training), and to initiate entrepreneurial activity, because of limited access to credit (Islam and Pakrashi 2014, Boucher, Carter, and Guirkinger 2008) and disempowerment in the community (Gillespie 2004).
Thus, when women with the potential to be employed or to undertake entrepreneurial activities are unable to do so, the result is low female labor force participation (FLFP) in India, which in turn slows the growth of the rural economy (Chaudhary and Verick 2014). From 1972 to 2013, FLFP in India declined from 31.8 percent to 26.7 percent (Sanghi, Srija, and Vijay 2015, Government of India 2011b, International Labour Organization 2013). A similar decline has been observed in the rural workforce participation ratio (WPR), which fell from 30.79 percent in 2000-01 to 30.02 percent in 2011 according to the last population census (Government of India 2011a). Furthermore, there is almost a 50 percent gap between male and female participation rates.
According to Chatterjee, Murgai, and Rama (2015), the most important reason for the decline of the rural FLFP in India has been the reduction in agricultural jobs without the commensurate emergence of other employment opportunities. The need to design policies and programs to enhance the economic and livelihood activities of women is more urgent than ever, and reduced gender inequality and increased female empowerment have clear and significant economic benefits (Duflo 2012). Recently, policy makers from all over the world have focused on promoting entrepreneurship, with a specific focus on women entrepreneurs. It is assumed that entrepreneurship can transition more women into the labor force directly through self-employment and indirectly by creating livelihood opportunities for other women, as women entrepreneurs are more likely to hire female workers (Ghani, Kerr, and O'Connell 2014).
Until recently, micro-credit programs were among the most prevalent interventions in many countries, and they were expected to help poor households become microentrepreneurs and expand their business activities (Arouri et al. 2014). Hence, large resources have been, and continue to be, channeled into programs and companies providing small loans, with the expectation that the poor would buy productive assets and grow their small businesses. However, most experimental studies of micro-credit have not found significantly favorable results on income, assets and profits of enterprises (Banerjee, Karlan, and Zinman 2015, Banerjee 2013, Kaboski and Townsend 2012), except a single study in Uganda (Fiala 2013).
More recently, experimental evidence has emerged suggesting that, instead of loans, direct unconditional grants can sufficiently encourage entrepreneurial activities. Recent studies have found large and significant impacts of unconditional grants to poor households on their income (Baird, McIntosh, and Özler 2011, Haushofer and Shapiro 2016, Blattman, Fiala, and Martinez 2013, Arnold, Conway, and Greenslade 2011), but the long-term effects of these grants are still not clear (Haushofer and Shapiro 2018). Similar studies in other parts of the world have also found positive outcomes from providing cash transfers with alternative combinations of business training and other capacity building activities (Blattman et al. 2014, Fafchamps et al. 2011, Macours, Premand, and Vakis 2012, De Mel, McKenzie, and Woodruff 2008).
Lastly, a multi-sectoral approach has recently been adopted by a few programs (more specifically, in South Asia), which combine micro-loans and savings, asset transfers, business training, and social networks into one set of interventions. The majority of these programs target women and are implemented through women-only community groups such as self-help groups (SHGs). It has been argued that the integrated approach is more likely to generate sustainable impacts on female labor force participation, livelihoods, income, assets and other indicators of household welfare. Several empirical studies have estimated the welfare impacts of such community based multi-sectoral rural livelihood interventions (Datta 2013, 2015, Parajuli et al. 2012, Prennushi and Gupta 2014, Deininger and Liu 2013b, a, Desai and Joshi 2012, Hoffmann et al. 2017, Khanna, Kochhar, and Palaniswamy 2015). Most of these report significant effects on asset holdings and skills development, but impacts on consumption and income are more elusive.
A recently published six country study shows that the multi-sectoral approach to rural livelihoods when combined with assets transfer instead of only micro-loans can have positive returns and long-term effects on household income and assets (Banerjee et al. 2015).
Despite the large-scale and continued presence of community development and rural livelihoods projects over several years, evidence of the effects of such programs on key economic outcomes like income, labor force participation and seasonal migration remains limited. Policy makers have often cited that "policy initiatives focused on microfinance-supported self-help group-centered activities are required to make females economically active along with handling domestic duties and help to address the need for working finance as expressed by the females willing to accept work" (Sanghi, Srija, and Vijay 2015).
To date, only two studies have attempted to examine the impacts of livelihood interventions on female labor outcomes in India. Both studies focus on initiatives that were implemented by SEWA (Self-Employed Women's Association), in Gujarat (Ahmedabad) and rural Rajasthan, and they reported mixed results. The Ahmedabad study showed positive results for the female labor participation rate (in the longer run), especially through increased household business activity (Field, Martinez, and Pande 2016). The rural Rajasthan study concluded that there were negligible differences in female labor participation attributable to the initiatives, but could not conclusively disentangle the effect of the intervention from the confounding impact of the Mahatma Gandhi National Rural Employment Guarantee Act (MGNREGA) (Desai and Joshi 2012).
This study seeks to add to this nascent literature by providing empirical evidence of short to medium run impacts of the intensive model of the National Rural Livelihoods Mission (NRLM), the largest multi-sector rural livelihoods intervention, at two levels: (i) intermediate outcomes such as labor force participation, savings, access to loans and migration; and (ii) final outcomes such as assets, entrepreneurship, and household income. The study also reports distributional impacts of the intensive model of NRLM in order to account for specific design features that include a focus on poor and landless rural households.

2 Setting and Intervention
The beginnings of NRLM date back to the late 1980s, when several NGOs and development agencies such as UNDP, the World Bank, and DFID launched women-focused community rural livelihoods programs in southern states of India such as Andhra Pradesh, Kerala and Tamil Nadu. Owing to successful pilots, several state governments in south India scaled up these programs either through state budgets or through external donor funds. The second wave of such programs was then initiated in northern and eastern states, and a few, such as the one in Bihar, even reached significant scale. These successes demonstrated that such intensive government-run programs could be implemented in challenging institutional contexts. Encouraged by these developments, the Government of India, in 2011, phased out the long-standing but struggling Swarnajayanti Gram Swarozgar Yojana (SGSY) program (Commission 2005) and replaced it with NRLM. The new national program is unique in that it adopted a state-specific approach that builds on the history of each state's community livelihoods program. This approach is expected to ensure continuity of efforts and introduce programmatic innovations that can address local constraints and challenges.
In 2011 the Government of India, with support from the World Bank, also piloted an "intensive" approach of the already successful state models in 584 blocks through the National Rural Livelihoods Project (NRLP). The "intensive" approach employed additional teams in the field at the sub-district level to mobilize poor women into self-help groups. These groups are then provided with seed funds and linked with various commercial banks for low-cost credit. As the groups mature, participants are provided: 1) trainings in social and economic skills (such as group management, negotiating skills, and financial management); and 2) assistance to access a range of other government programs. In addition, federations of SHGs are set up at various levels, and common economic interest groups (popularly called producer groups, PGs) are mobilized. Members of these PGs are provided advanced trainings in specific sectors such as agriculture and livestock, and in setting up their own businesses.
The theory of change for this intervention is that the formation of institutions of the poor (social networks) through social mobilization, together with greater access to low-cost credit and trainings, will provide women with improved agency. This will facilitate retiring of high-cost debts and reduction of vulnerability. As a result, household consumption will be smoothed and investments in productive assets will improve. It is also anticipated that SHGs and their federations will help women to strengthen their social networks, improve their intra-household bargaining power, and demand greater accountability and response from government and private services. Owing to its unique features, it is also expected that the program will enable livelihoods diversification at the household level and help expand business activities in rural areas. Eventually more women will be able to participate in the labor market and improve household income. Over time, it is envisaged that these community groups and their federations will gradually become, in the program's words, an "institutional platform of the poor", and strengthen the voice and negotiating power of small and marginal producers with the markets. See Figure 1.

Roll-out of NRLP
Since 2011, NRLP has mobilized more than 8.8 million women from poor rural households into self-help groups in its 13 focus states. The majority of these women belong to Scheduled Castes, Scheduled Tribes and other vulnerable households. At the time of canvassing of the household survey, the program had completed the first set of trainings for almost all the beneficiaries. By 2016, the project succeeded in universally implementing its first-order interventions of mobilizing more women into groups in the treatment areas, providing low cost credit and basic training. According to project monitoring estimates, private financial institutions have disbursed approximately US$640 million to the NRLM SHGs. The most recent data indicate that approximately US$300 million has been provided as community grants as a revolving fund (World Bank 2018). As per current estimates, approximately INR 6,200 (US$95) per household has been given through community grants since 2011 in the three states of Jharkhand, Maharashtra and Madhya Pradesh that we study here.

Data and Identification Strategy
The data for this study come from community and household surveys in three states of India: Jharkhand, Maharashtra and Madhya Pradesh. These are states where the NRLP has been implemented at scale. The survey was canvassed between November 2016 and February 2017. Table 1 summarizes the final sample.

Sampling Strategy
The survey followed a multi-stage sampling strategy at the block, village and household levels. We first identified potential treated blocks where the program take-up rate was at least 50 percent. Then we listed potential comparison blocks in consultation with the national and various state level rural livelihood missions, identifying those blocks where the program had just entered in 2016 or was about to take off in the following year. The treatment and comparison blocks were randomly selected from the list of potential blocks. This block selection strategy reduced the program placement bias to some extent at the sampling stage itself.
After block selection, propensity scores were estimated using a standard logistic specification in order to construct the group of matched treatment and control villages using village-level variables from the pre-intervention period. Variables such as literacy, availability of infrastructure, composition of caste, access to land, remoteness and indicators associated with household amenities were drawn from the Socio Economic and Caste Census (SECC) 2013. Data on other village-level characteristics were drawn from the Population Census 2011. Several variables were identified in consultation with state missions because implementation agencies have recently used these for identification and mobilization of target groups. The SECC data were given priority over the census data wherever data on a characteristic were available from both sources, as household level data were available for the SECC and it was a more recent survey. The choice of other variables in the village selection model was based on the literature review of similar interventions. Results from the village selection model for each state are given in the Annex (Tables A1-A3).
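The village selection step described above can be illustrated with a minimal sketch. It fits a logistic model by plain gradient ascent and returns a propensity score for each village. The two covariates, the eight villages, the learning rate, and the function names `fit_logit` and `propensity` are all hypothetical choices made for illustration; they are not the specification or data used in the study.

```python
import math

def fit_logit(rows, treated, lr=0.5, steps=2000):
    """Fit a logistic regression by gradient ascent; returns [intercept, coefficients...]."""
    n, k = len(rows), len(rows[0])
    beta = [0.0] * (k + 1)
    for _ in range(steps):
        grad = [0.0] * (k + 1)
        for x, d in zip(rows, treated):
            z = beta[0] + sum(b * v for b, v in zip(beta[1:], x))
            resid = d - 1.0 / (1.0 + math.exp(-z))  # observed minus predicted
            grad[0] += resid
            for j, v in enumerate(x):
                grad[j + 1] += resid * v
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta

def propensity(beta, x):
    """Predicted probability that a village with covariates x is a program village."""
    z = beta[0] + sum(b * v for b, v in zip(beta[1:], x))
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical pre-intervention covariates: [literacy rate, SC/ST population share]
villages = [[0.40, 0.60], [0.45, 0.55], [0.50, 0.50], [0.55, 0.50],
            [0.60, 0.40], [0.65, 0.30], [0.70, 0.35], [0.75, 0.20]]
treated = [1, 1, 1, 0, 1, 0, 0, 0]

beta = fit_logit(villages, treated)
scores = [propensity(beta, v) for v in villages]
```

In practice the scores would come from a statistical package with the full covariate list, estimated separately by state as the text describes; the sketch only shows what "propensity score from a logistic specification" means operationally.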
Once the matched list of treatment and control villages was identified, 6 households were randomly selected from each village from the SECC's household list. Due to the nature of the rollout of the program, the sample was not evenly distributed across states. Program placement bias owing to state specific rollout heterogeneity was therefore addressed by specifying the village selection model at the state level. However, the household selection model was proposed for the entire sample. This was done to mimic power calculations and ensure availability of a larger group of comparison households to match from.

Survey Data
In this section we present the summary statistics on key characteristics of the surveyed sample.

Two methods are primarily used to address the counterfactual problem in observational studies: propensity score matching (PSM) and the instrumental variable (IV) method. PSM has become increasingly popular in medical trials and in the evaluation of policy interventions (Becker and Ichino 2002). In observational studies, assignment of subjects to treatment and control areas is not random. Therefore, the estimate of the program effect is expected to be inconsistent because of confounding factors. PSM corrects the estimation of treatment effects to the extent that one can identify the confounding factors. The bias is reduced when outcome variables are compared for treated and comparison group units that are as similar as possible in the absence of treatment (Becker and Ichino 2002). However, the PSM approach has a serious limitation: unobserved factors that influence program participation may also be related to the program outcomes, in which case the estimates remain biased. We therefore use the instrumental variable method to check the robustness of the PSM results.

Propensity Score Matching
We estimate the emerging impacts (Intention-to-treat, ITT) of NRLM on household welfare outcomes such as income, assets, expenditure etc. (Godtland et al. 2004). Owing to nonrandom allocation of the program to villages and household self-selection, simple comparison of outcome variables will give us inconsistent estimates of program impact. Based on program design and rollout strategy, there are multiple sources of self-selection and program placement bias. For instance, NRLM is a demand driven program and hence households exposed to the treatment can be systematically different from those who did not choose to participate in the program. In that case, it is quite likely that the differences in outcomes are due to pre-program differences. In addition to selection bias, there could also be program placement bias.
We use propensity score based matching methods for estimating NRLM impacts. The propensity score matching (PSM) method is used for estimating the average program effects, Inverse Probability Treatment Weights (IPTW) for heterogeneous outcomes, and quantile regressions for distributional welfare effects. As the heterogeneous results were weak, they are not reported in the paper. An instrumental variable specification is used as a check on the average program impacts.
We use PSM to construct a comparison group as an estimate for the counterfactual outcome (Heinrich, Maffioli, and Vazquez 2010). Following Rosenbaum and Rubin (1983), we match households using propensity scores estimated by the selection model. The treatment status is regressed on a set of pre-determined household and exogenous village characteristics.
The selection model includes variables that are drawn from the literature as well as the strategic and implementation guidelines of NRLM, such as the proportion of SC and ST population, poverty rate (proxy indicators), and village remoteness indicators such as distance to the nearest district headquarters. Other variables relate to socio-demographic characteristics such as age, caste, dependency ratio, completed years of education, and characteristics of the household head. Data for the selection model were drawn from primary household and village surveys as well as from Census 2011 and SECC 2013-14. In line with Khanna, Kochhar, and Palaniswamy (2015), we also use retrospective indicators of households such as ownership of amenities, income, land and dwelling characteristics. Village variables include population density, percentage of female working population, infrastructure facilities (such as roads, primary schools, primary health centers and agriculture marketing societies), percentage of households with amenities (electricity and bicycles), and village size.
PSM rests on two primary assumptions: (a) conditional independence (CI) and (b) common support (CS). CI implies that, in the absence of treatment, NRLM outcomes are independent of the treatment status of households, conditional on observables that include household and village level covariates. The conditional independence condition is expressed as:

$$(Y^{D}, Y^{C}) \perp D_i \mid X_i$$

where $Y^{D}$ and $Y^{C}$ represent outcomes for participants and non-participants, $D_i$ is the treatment status and $X_i$ are the observables (Khandker 2010). Applying this, we calculate the propensity scores as in Table 2.

The CS assumption ensures that, conditional on $X_i$, treated units have neighboring comparison households in the propensity score distribution (Heckman, LaLonde, and Smith 1999, Khandker et al. 2009). This implies:

$$0 < P(D_i = 1 \mid X_i) < 1$$

Figure 2: Area of Common Support: Kernel Density Estimate

Figure 2 shows that there is a satisfactory overlap in the propensity score distribution across program and non-program villages. For our data, the region of common support is given by (0.042, 0.929). We use the kernel algorithm to match households in program and control areas. This method compares outcomes for the treated sample with weighted contributions from all the households in the control areas in order to form an estimate of the counterfactual outcome (Lance et al. 2014). A comparison group household with the closest propensity score to the program area household receives the highest weight, while households with more distant propensity scores receive smaller weights, resulting in a "smoothed" weighted matching estimator, with the degree of smoothing governed by the bandwidth (Titus 2007). Frölich (2004) asserts that, among different matching algorithms, kernel matching produces the most precise estimates.
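To make the kernel matching step concrete, the following is a minimal sketch of an average treatment effect on the treated computed with kernel weights on the common support. The Gaussian kernel, the bandwidth of 0.06, the function name `kernel_att`, and the toy data are assumptions for illustration only; the paper does not state which kernel or bandwidth was used.

```python
import math

def kernel_att(treated, controls, bandwidth=0.06):
    """Kernel-matching ATT. Inputs are lists of (propensity_score, outcome) pairs."""
    # Common support: the overlap of the two groups' score ranges.
    lo = max(min(p for p, _ in treated), min(p for p, _ in controls))
    hi = min(max(p for p, _ in treated), max(p for p, _ in controls))
    effects = []
    for p_t, y_t in treated:
        if not lo <= p_t <= hi:
            continue  # drop treated households off the common support
        # Gaussian kernel: comparison households with closer scores get larger weights.
        weights = [math.exp(-0.5 * ((p_t - p_c) / bandwidth) ** 2) for p_c, _ in controls]
        total = sum(weights)
        if total == 0.0:
            continue
        counterfactual = sum(w * y_c for w, (_, y_c) in zip(weights, controls)) / total
        effects.append(y_t - counterfactual)
    if not effects:
        raise ValueError("no treated units on the common support")
    return sum(effects) / len(effects)

# Hypothetical data: every comparison household has outcome 5, every treated one 7.
treated_units = [(0.30, 7.0), (0.50, 7.0), (0.95, 7.0)]
control_units = [(0.20, 5.0), (0.40, 5.0), (0.60, 5.0)]
att = kernel_att(treated_units, control_units)
```

In this toy example the treated household with score 0.95 lies outside the overlap and is dropped, and the remaining matched differences average to 2. A smaller bandwidth concentrates weight on the nearest comparison households, while a larger one smooths over more of them.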
The design and implementation of NRLM emphasize targeting vulnerable rural households, including social groups disadvantaged by caste and endowments. Therefore, we anticipate that NRLM will have a cumulative effect on these groups of households, that is, an effect over and above the direct program effects. Following Khanna, Kochhar, and Palaniswamy (2015) and Chen, Mu, and Ravallion (2009), we estimate the heterogeneous impacts on landless and scheduled caste (SC) households using propensity-weighted ordinary least squares. Hirano, Imbens, and Ridder (2003) show that using propensity scores to calculate the weights balances the covariates and results in efficient estimates. We use the following specification to estimate the heterogeneous effects of the program:

$$Y_i = \alpha + \beta P_i + \gamma V_i + \delta (P_i \times V_i) + \theta' X_i + u_i$$

where $Y_i$ is the outcome of the $i$th household, $P_i$ is the treatment status of the $i$th household, $V_i$ is a dummy variable indicating that the $i$th household belongs to a vulnerable group (that is, landless households and SC households), $X_i$ is a vector of household and village variables, and $u_i$ is the random error term. The parameter of interest is $\delta$, which is a difference-in-difference type estimate of the heterogeneous impact of the program. The specification is estimated with propensity weights computed from the estimated propensity score of the $i$th household.
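The difference-in-difference interpretation of the heterogeneous effect can be sketched as follows. The weight formula used here (one over the propensity score for treated households, one over one minus the score for comparison households) is a common inverse-probability form and is an assumption, since the paper's own weight formula is not recoverable from the text. The sketch also assumes no additional covariates, in which case weighted least squares on treatment, vulnerability, and their interaction reproduces weighted cell means. All names and data are hypothetical.

```python
def iptw_weight(treated, pscore):
    """Assumed inverse-probability weight: 1/p if treated, 1/(1-p) otherwise."""
    return 1.0 / pscore if treated else 1.0 / (1.0 - pscore)

def weighted_cell_mean(rows, treated, vulnerable):
    """Propensity-weighted mean outcome for one (treatment, vulnerability) cell."""
    cell = [(r["y"], iptw_weight(r["P"], r["pscore"]))
            for r in rows if r["P"] == treated and r["V"] == vulnerable]
    return sum(y * w for y, w in cell) / sum(w for _, w in cell)

def heterogeneous_effect(rows):
    """Interaction coefficient: treatment gap for vulnerable minus gap for other households."""
    gap_vulnerable = weighted_cell_mean(rows, 1, 1) - weighted_cell_mean(rows, 0, 1)
    gap_other = weighted_cell_mean(rows, 1, 0) - weighted_cell_mean(rows, 0, 0)
    return gap_vulnerable - gap_other

# Hypothetical households: y = outcome, P = treatment, V = vulnerable group dummy.
households = [
    {"y": 12.0, "P": 1, "V": 1, "pscore": 0.8},
    {"y": 16.0, "P": 1, "V": 1, "pscore": 0.4},
    {"y": 10.0, "P": 0, "V": 1, "pscore": 0.5},
    {"y": 10.0, "P": 0, "V": 1, "pscore": 0.2},
    {"y": 11.0, "P": 1, "V": 0, "pscore": 0.5},
    {"y": 11.0, "P": 1, "V": 0, "pscore": 0.6},
    {"y": 10.0, "P": 0, "V": 0, "pscore": 0.5},
    {"y": 10.0, "P": 0, "V": 0, "pscore": 0.3},
]
effect = heterogeneous_effect(households)
```

With covariates included, the same quantity would be obtained as the interaction coefficient from a weighted regression rather than from cell means.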
The program may have generated average welfare effects, but averages do not reveal how these effects were distributed across household segments defined by the program outcome(s). We use quantile regressions to account for this heterogeneity of treatment effects in the distribution of outcome variables, specifically with respect to household income and borrowings. Quantile regressions are well suited to estimating the impact of the program over the whole distribution of a program outcome. For example, if participating households are categorized into different groups based on a specific program outcome variable, the quantile regression will estimate which outcome group has benefitted most from the program.
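Formally, the quantile regression equation for the distributional effects of NRLP on welfare outcome $Y$ of the $i$th household can be expressed as follows:

$$Q_\tau(Y_i \mid X_i, P_i) = X_i'\beta_\tau + \delta_\tau P_i$$

where $\tau$ denotes the quantile of outcome $Y$, $X$ is a vector of exogenous variables, $P$ is the program participation variable, and $\delta_\tau$ is the quantile treatment effect (QTE) such that:

$$\delta_\tau = q_{1,\tau} - q_{0,\tau}$$

where $q_{1,\tau}$ and $q_{0,\tau}$ are the $\tau$th quantiles of $Y_1$ and $Y_0$, the outcomes for participants and non-participants. We use a weighted quantile regression approach proposed by Sherwood, Wang, and Zhou (2013). The inverse probability weighted quantile regression estimator is semi-parametric, consistent, and asymptotically normal. The weighted quantile regression estimator is defined as:

$$(\hat{\beta}_\tau, \hat{\delta}_\tau) = \arg\min_{\beta, \delta} \sum_{i=1}^{n} \hat{w}_i \, \rho_\tau(Y_i - X_i'\beta - \delta P_i)$$

where $\rho_\tau(u) = u(\tau - 1\{u < 0\})$ is the quantile check function and $\hat{w}_i$ is the inverse probability weight of the $i$th household. Details of the derivations of weighted quantile regressions are provided in Sherwood, Wang, and Zhou (2013).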
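A simplified illustration of the quantile treatment effect may help. When participation is the only regressor, a weighted quantile regression reduces to the difference between the weighted quantiles of the two groups, which is what the sketch below computes. It is not the Sherwood, Wang, and Zhou (2013) estimator itself; the function names, incomes, and weights are hypothetical.

```python
def weighted_quantile(values, weights, tau):
    """Smallest value whose cumulative weight share reaches tau."""
    pairs = sorted(zip(values, weights))
    total = sum(w for _, w in pairs)
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= tau * total:
            return value
    return pairs[-1][0]

def quantile_treatment_effect(y_treated, w_treated, y_control, w_control, tau):
    """Difference between the weighted tau-th quantiles of the two groups."""
    return (weighted_quantile(y_treated, w_treated, tau)
            - weighted_quantile(y_control, w_control, tau))

# Hypothetical household incomes with unit weights.
income_treated = [10, 20, 30, 40, 50]
income_control = [5, 15, 25, 35, 45]
unit = [1, 1, 1, 1, 1]

qte_median = quantile_treatment_effect(income_treated, unit, income_control, unit, 0.5)
qte_q75 = quantile_treatment_effect(income_treated, unit, income_control, unit, 0.75)

# Up-weighting the richest comparison household shifts the comparison median upward.
reweighted_median = weighted_quantile(income_control, [1, 1, 1, 1, 4], 0.5)
```

The last line shows why the weights matter: changing them moves the comparison group's quantiles and therefore the estimated effect at each point of the distribution.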

Instrumental Variable Regression
PSM provides consistent impact estimates under the Conditional Independence Assumption. This is a very strong assumption: PSM can control for observed sources of confounding, but unobserved factors that influence program participation may also be related to the program outcomes. For example, some households are more entrepreneurial than others, and this may influence both their willingness to participate in village organizations supported by NRLM and their welfare outcomes. Such a possibility would render the estimated program impacts biased and ultimately inconsistent. We therefore use instrumental variable regression estimates to establish the robustness of the PSM estimates.
We have identified suitable instruments ($Z$) that can influence a household's decision to participate in the program, $P$ (that is, $Z'P \neq 0$), but not the error term $u_i$ (that is, $Z'u = 0$). In other words, the instruments have been chosen so that they are correlated with the likelihood of program participation but are independent of the error processes that determine the household's decision to participate in the program and its welfare outcomes. We use the two-stage instrumental variable approach with the following specification:

$$P_i = \alpha_0 + \alpha_1' Z_k + \alpha_2' X_i + v_i \tag{3}$$

$$Y_i = \beta_0 + \beta_1 \hat{P}_i + \beta_2' X_i + u_i \tag{4}$$

The empirical specification outlines the underlying processes through which households choose to participate in NRLM and its impact on their welfare outcome. Equation (3) is the reduced form specification, where $Z_k$ contains $k$ instrument(s) and $X_i$ is a vector of village level exogenous variables. The estimated coefficients from equation (3) are used to predict participation for each household in the sample. The predicted participation $\hat{P}_i$ enters the second stage equation (4) as an exogenous variable.
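The two-stage logic can be sketched in a few lines. The example below uses a single binary instrument and no covariates, which is a simplification of the specification in the text, and the data are constructed so that the unobserved term is correlated with participation but not with the instrument. The function names and numbers are hypothetical.

```python
def ols(x, y):
    """Simple one-regressor OLS; returns (intercept, slope)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    return mean_y - slope * mean_x, slope

def two_stage_least_squares(z, p, y):
    """Stage 1: regress participation on the instrument. Stage 2: regress outcome on predicted participation."""
    a0, a1 = ols(z, p)
    p_hat = [a0 + a1 * zi for zi in z]
    _, effect = ols(p_hat, y)
    return effect

# Hypothetical data generated as y = 2 * p + u, where u is correlated with p
# (unobserved entrepreneurial ability) but has the same mean at both instrument values.
z = [0, 0, 0, 0, 1, 1, 1, 1]
p = [0, 0, 0, 1, 1, 1, 1, 0]
u = [-1, -1, -1, 3, 1, 1, 1, -3]
y = [2 * pi + ui for pi, ui in zip(p, u)]

iv_estimate = two_stage_least_squares(z, p, y)
_, naive_ols_estimate = ols(p, y)
```

Here the naive regression of the outcome on participation overstates the effect because of the unobserved term, while the two-stage estimate recovers the coefficient of 2 used to build the data. Running the second stage by hand gives the right point estimate but not the right standard errors, so a dedicated IV routine is needed for inference.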
Selecting appropriate instruments is central to the IV estimates. As mentioned above, an instrument should influence the participation decision of the household but be exogenous to the entire process. As NRLM is rolled out in a village, a household has to make a conscious choice to become a member of the village organization and realize several benefits, including availability of micro-credit through the revolving fund. One of the major thrust areas of the program is to promote micro-enterprises in villages. Villages have a history of engaging in business activities through village enterprises such as tea shops, tailoring, blacksmithing, eating houses, bakeries, petty shops, etc. The extent of business activities inside a village is determined by a set of processes, both observed and unobserved, that were set in motion in the past and hence are exogenous to the program implementation. We compute a Village Enterprise Index, constructed as a Simpson's index. Given that the Village Enterprise Index is a village level variable and is determined historically, it can be safely inferred that the index is not correlated with households' unobserved characteristics and outcomes. We compute the Village Enterprise Index for the year 2012, the same year when the program was first rolled out in Indian villages.
Another instrument that we use is the size of total village land. NRLM specifically targets landless households, and the size of village land is likely to influence program participation. However, village level variation in land size is likely to be uncorrelated with unobserved household characteristics and with welfare outcomes today. This claim is likely to hold because land settlement patterns and the size of villages, particularly in rural India, have remained essentially unchanged for centuries (Anderson 2011). However, many villages have been merged with the nearest cities and towns. We take the size of village land net of own land to generate an additional source of variation in the sample.
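As an illustration of what a Simpson's index of village enterprises could look like, the sketch below uses the common diversity form, one minus the sum of squared shares of each enterprise type. The exact formula used for the Village Enterprise Index is not given in the text, so this form, the function name `simpson_index`, and the counts are assumptions.

```python
def simpson_index(counts):
    """Simpson's diversity index: 1 - sum of squared shares (0 means a single type)."""
    total = sum(counts)
    if total == 0:
        return 0.0  # a village with no enterprises has no diversity to measure
    return 1.0 - sum((c / total) ** 2 for c in counts)

# Hypothetical counts of enterprises by type (tea shops, tailoring, petty shops, bakeries).
single_type_village = simpson_index([10, 0, 0, 0])
two_type_village = simpson_index([5, 5, 0, 0])
diverse_village = simpson_index([1, 1, 1, 1])
```

Under this form, the index rises from 0 for a village with one kind of enterprise toward 1 as enterprises are spread evenly over more types.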
In the table below, we report the representative first-stage results of the instrumental variable regression. The results in Table 4 indicate that both instruments satisfy the first condition, $Z'X \neq 0$ (where $Z$ denotes the instruments and $X$ the endogenous regressor, program participation): each instrument is individually significant, and they are jointly significant with a high F-statistic, which implies that the instruments are strong. We also test instrument exogeneity and instrument relevance. The exogeneity test checks whether the instruments are uncorrelated with the error term $\varepsilon$ of the outcome equation, that is, $Z'\varepsilon = 0$. This test is possible when we have at least one more instrument than the number of endogenous variables, that is, an over-identified system, which is the case here. The test is implemented by the Hansen J statistic, distributed as chi-square. The test statistic (0.905) and its p-value (0.34) indicate that we fail to reject the null hypothesis of no correlation with the error term of the outcome equation. The test of instrument relevance checks whether the instruments are correlated with the endogenous regressor of the outcome equation. The test is implemented by the Kleibergen-Paap LM statistic, distributed as chi-square under the null hypothesis that the equation is under-identified and the instruments do not influence the endogenous variable, that is, program participation. Results in Table 4 suggest that we can easily reject the null hypothesis of irrelevant instruments. A further test is that of weak instruments, which is implemented by the Cragg-Donald Wald statistic, distributed as a Wald F-statistic. We compare the F-statistic with the Stock-Yogo critical value defined for an IV bias that is 5 percent of the OLS bias. An F value of 11 or higher is considered sufficient to reject the null hypothesis that the bias of the IV estimate due to weak instruments is greater than 5 percent of the corresponding bias in the OLS estimate (Khandker et al. 2014). Table 4 shows an F value of 84.85, implying that the instruments pass the weak identification test. Finally, Table 4 also reports the results of the endogeneity test using the Wu-Hausman statistic, distributed as F; the null of exogeneity of program participation can be easily rejected.
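The logic of these diagnostics can be illustrated on simulated data. The sketch below is not the estimation code behind Table 4: the data, coefficients, and function names are hypothetical, the endogenous regressor is continuous rather than a binary participation indicator, and the over-identification check is a Sargan-type statistic (the homoskedastic counterpart of the Hansen J test). The Kleibergen-Paap and Cragg-Donald statistics are not computed here.

```python
import numpy as np

def ols(X, y):
    """Return OLS coefficients and residuals for y = X b + e."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta, y - X @ beta

def iv_diagnostics(y, d, Z, W):
    """2SLS with one endogenous regressor d, excluded instruments Z,
    and exogenous controls W (W must contain a constant column).
    Returns the 2SLS effect of d, the first-stage F-statistic for the
    excluded instruments, and a Sargan over-identification statistic."""
    n = len(y)
    all_instruments = np.column_stack([Z, W])
    # First stage: regress d on the instruments and controls.
    _, e_full = ols(all_instruments, d)
    # Restricted first stage: controls only (instruments excluded).
    _, e_restr = ols(W, d)
    q = Z.shape[1]
    df = n - all_instruments.shape[1]
    rss_full, rss_restr = e_full @ e_full, e_restr @ e_restr
    f_stat = ((rss_restr - rss_full) / q) / (rss_full / df)
    # Second stage: replace d with its first-stage fitted values.
    d_hat = d - e_full
    beta, _ = ols(np.column_stack([d_hat, W]), y)
    # Structural residuals use the observed d, not the fitted values.
    u = y - np.column_stack([d, W]) @ beta
    # Sargan statistic: n * R^2 from regressing u on all instruments;
    # chi-square with (q - 1) degrees of freedom under the null.
    _, v = ols(all_instruments, u)
    u_centered = u - u.mean()
    sargan = n * (1.0 - (v @ v) / (u_centered @ u_centered))
    return beta[0], f_stat, sargan

# Simulated example: the true effect of d on y is 2.0 (hypothetical).
rng = np.random.default_rng(0)
n = 5000
z1, z2 = rng.normal(size=n), rng.normal(size=n)
confounder = rng.normal(size=n)            # unobserved by the analyst
d = 0.5 * z1 + 0.5 * z2 + confounder + rng.normal(size=n)
y = 2.0 * d - 1.5 * confounder + rng.normal(size=n)

Z = np.column_stack([z1, z2])
W = np.ones((n, 1))
effect, f_stat, sargan = iv_diagnostics(y, d, Z, W)
print(f"2SLS effect: {effect:.3f}")
print(f"First-stage F: {f_stat:.1f}")
print(f"Sargan statistic: {sargan:.3f}")
```

With two valid instruments and one endogenous regressor, the Sargan statistic has one degree of freedom, so a small value (as with the 0.905 reported above for the Hansen J) is what one expects when the instruments are exogenous.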

Key findings
The underlying hypothesis of these livelihoods programs, including NRLP, is that participation in community groups (such as self-help and producer groups), when combined with training, savings, and loans, would lead to increased mobility, greater access to economic resources, and stronger social networks for women, eventually empowering them and resulting in higher labor force participation. We find that a number of these objectives are being realized; the findings are summarized below.

Labor Outcomes
There is already a significant body of literature demonstrating that participation in such programs has led to higher levels of empowerment, more favorable intra-household decision-making for women, and higher female mobility (Rao 2017; Sanyal, Rao, and Majumdar 2015), but no previous study has measured female labor force participation or the worker participation ratio. 14 The presence of a comprehensive member-level livelihoods profile in this survey gives us an opportunity to fill this critical gap.
We begin by looking at three aspects of labor outcomes. First, we analyze if households and female members have diversified their livelihoods. Then we inspect the nature of this diversification and lastly, we examine if there are any meaningful changes in higher level outcomes such as labor force rate or worker participation rate.
We start by testing the assumption that program participation is leading household members, especially women, to take up more livelihood activities. There is overwhelming anecdotal evidence from project monitoring to support this assumption, and our data are consistent with it. The data suggest a large and significant increase in the number of livelihoods being pursued by households (see Table A 7 for the full results). In the overall sample, the number of livelihoods is about 20.4 percent higher among the treatment households (with an average of 3.8 livelihood activities per household in the treatment areas).
Most of this increase is due to the increase in the number of livelihoods of female members (38.5 percent higher in treatment areas). For female members of productive age (15-64 years old) the results are similar (33 percent higher in treatment areas).
Next, we inspect the nature of the livelihoods activity that is causing this increase. Again, field notes and project monitoring suggest that program beneficiaries are moving away from being daily wage laborers (casual labor) to self-employment, thus initiating more businesses. The survey data confirm this. The increase in livelihoods is predominantly due to the large and significant increase in the number of self-employment livelihoods activities in farm (5.4 percent more women employed) and non-farm activities (0.7 percent more women employed), with an overall increase of 5.8 percent. Most of this increase is coming from households moving away from casual farm labor towards self-employment such as farm and non-farm businesses. Furthermore, the results indicate that there is an 8.4 percent increase in the number of farming households who have transitioned to high-value agricultural crops. 15 This is another indicator of improved diversification and an additional source of employment opportunities. Lastly, the data also indicate a small but significant increase in the number of formal jobs within the households. It is possible that this increase is due to the program's focus on skills and job placement (in select areas) or to more businesses starting and employing more people.
Last, we analyze whether the increase in the number of livelihoods and diversification has influenced women's participation in the labor force. The survey data allow us to look at the work participation rate (WPR) 16 and we focus on the female household members as they are the primary beneficiaries of the program. In 2011, as per the population census data in the sample areas, there was almost no difference between the treatment and control villages (see Table A 5). 17 However, in 2016-17 we estimate that there is a 5.5 percent increase in overall WPR in treatment areas and an even higher increase of 7.7 percent among women of productive age, when compared to control areas. Interestingly, the increase in paid livelihoods amongst adult women is almost the same, at 7.3 percent in treatment areas when compared to control areas.
This is a significant result from a policy point of view for two reasons. First, the increase in WPR is for the entire village and not just for program beneficiaries. Second, the results seem to suggest that, in treatment areas, the program might have been able to arrest the decline in female labor force participation. To illustrate, in 2011, before the program, 490 of every 1,000 women in a village were working. After the intervention, in 2017, the number of working women in the control areas went down to 404, in keeping with the overall trend, but in the treatment villages the decline was much smaller, to 459. So, in such a village, due to the intervention, 55 more women have joined or been retained in the labor force, amounting to a 13.6 percent increase in WPR. Although we cannot conclusively say much about Labor Force Participation Rates (LFPR), as we do not have data on the part-time and full-time nature of this work, ongoing evaluations may be able to collect and report on these data. Overall, we conclude that, in terms of labor outcomes, the program is achieving or is on track to achieve its objective of improving livelihoods by increasing employment opportunities for women in rural areas, with more women now being engaged in self-employment (farm and non-farm) in the short run. 19
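The arithmetic behind this illustration can be checked directly. The short sketch below uses only the per-1,000 figures quoted above (the variable names are ours) and separates the percentage-point gap from the relative increase, since the two are easily confused. Reading the 55-per-1,000 gap as 5.5 percentage points is our interpretation; the text does not state the unit explicitly.

```python
# Women working per 1,000 women, as quoted in the illustration above.
baseline_2011 = 490
control_2017 = 404
treatment_2017 = 459

# Declines relative to the 2011 baseline.
control_decline = baseline_2011 - control_2017
treatment_decline = baseline_2011 - treatment_2017

# Gap between treatment and control villages in 2017.
extra_women = treatment_2017 - control_2017
pp_difference = extra_women / 1000 * 100               # percentage points
relative_increase = extra_women / control_2017 * 100   # percent of control level

print(f"Decline in control villages: {control_decline} per 1,000")
print(f"Decline in treatment villages: {treatment_decline} per 1,000")
print(f"Additional women working: {extra_women} per 1,000")
print(f"Gap in WPR: {pp_difference:.1f} percentage points")
print(f"Relative increase over control: {relative_increase:.1f} percent")
```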

Borrowing, savings and assets
As noted earlier, the theory of change for the intervention is that women can gain agency and economic empowerment through the formation of social networks and greater access to low-cost credit, combined with skills development (Sanyal, Rao, and Majumdar 2015). The primary funding mechanism for NRLM is micro-loans through community grants and the SHG-Bank Linkage program. The program also has a major focus on promoting savings, using group membership as a commitment device. Therefore, we investigate outcomes related to the extent and nature of borrowings, savings, and various asset classes.
We start by looking at overall current savings and the amount saved in the previous 12 months. 20 We find that overall savings have gone up almost 18.6 percent and most of this increase is due to households shifting their savings to formal sources (45.3 percent higher savings in formal sources). But overall, there is only a 5.2 percent increase in share of formal savings to informal savings. We hypothesize that this could be because the share of formal savings is already 83 percent in the whole sample, as the national government has aggressively promoted financial inclusion. Lastly, we find that 6.5 percent more women have a bank account in the treatment group.
Next, we analyze the extent and nature of household borrowing. Consistent with the program focus, the reach of financial services has been expanded in treatment areas, which are completely unpenetrated by other institutional providers. For instance, less than 1 percent of households have a current loan from micro-finance companies and less than 2 percent have loans from commercial banks in the full sample (directly as individuals). But 35 percent of the households have a current loan through SHGs in treatment areas, and 12 percent in the whole sample. Overall, 24 percent more households have a current outstanding loan in treatment areas when compared to control areas. Anecdotal evidence suggests that most of these borrowers are accessing credit from institutional sources such as these for the first time.
As NRLP has been able to facilitate lending of over $1.2 billion since 2011 (World Bank 2018), we examine if this often-reported overall borrowing figure has translated into increases in average amounts borrowed at the household level. Data were collected for all active outstanding loans, and we find that there is indeed a huge increase in the amount borrowed at the household level in treatment areas (an increase of over two times in average borrowing compared to control areas). Although there is an overall increase in the loan size, the average loan size for SHG loans (INR 12,830 per loan) is still lower than for other sources such as MFIs (INR 49,350 per loan) and commercial banks (INR 92,290 per loan). Furthermore, we find that the intervention has resulted in altering the nature of borrowing, with higher loans being used for full productive purposes (11.2 percent higher in treatment areas). There is also a 4.9 percent reduction in annual interest rates that could be due to the lower cost of the SHG loans or to the interest subsidies for good repayment. Most of these borrowing-related results are quite consistent with previous impact evaluations of rural livelihoods programs in India.
Lastly, we look at the impacts on various asset classes: land, livestock, other productive assets, and consumption assets. Consistent with other studies, we find that there is almost no difference in the amount of land owned or operated by households in treatment areas, but it is possible that three years of participation is a short duration in such complex land markets. The results for livestock assets are clearer, with an average increase of 2.9 assets owned currently, but there is no impact on the change in livestock assets since 2011. There is a minor but significant increase in consumptive assets, but the results do not hold up in our robustness checks. For other productive assets, we find no significant difference between the treatment and control areas.
To summarize, households in treatment areas are saving and borrowing more. The intervention has been able to expand the reach of financial services in otherwise unpenetrated areas and even reduce the average borrowing costs. This has enabled the households to use these loans for productive purposes with results being significant for livestock assets but no perceptible increase in land or other productive assets yet.

Seasonal Migration and Income
To the best of our knowledge, this is the first time that data on seasonal migration and on income by source have been collected for any major Indian livelihoods program. With respect to seasonal migration, there is a statistically significant and large increase in the number of nights that household members from program areas spent outside their village, mostly to take up 'better employment'. However, as the survey did not collect much information on other aspects of employment, the data do not allow us to do any meaningful analysis of the reasons for this effect or of other migration-related outcomes.
Next, we look at income from various sources, as data were collected for household income from migration, agriculture, livestock, casual wages, non-farm enterprises, fisheries, full-time wage employment, public and private transfers, and any other sources. We first look at a few key enterprise-level outcomes, as that is the major focus of the intervention. Although more households have started non-farm businesses, we do not find any meaningful trends in the number of people employed in these enterprises or in their revenues. We did find that, due to the increased borrowing, treatment households were able to invest around 15 percent more funds in enterprises, but the sample of enterprises was very small. For most categories of income and for total income, we do not find any results in the overall sample, but there are several sub-populations that witnessed an income increase due to program participation (see Section 4.2 for more details). We do not report the results for consumption expenditure due to the timing of the survey and its related impact on food expenditure. 21

Robustness Checks
We use two-stage IV regressions to check the robustness of our impact estimates. Most of the results persist, but a few, such as the size of formal savings and collateral requirements, do not. It should be noted that matching and IV methods are based on different sets of assumptions. Therefore, the IV results should be read as a check on the strength of the PSM estimates. We have also analyzed results from different matching specifications, including caliper and nearest neighbor. 22
Furthermore, we also analyze the Average Treatment Effect on the Treated (ATT) estimates for the program participants in treatment areas. For this estimation, we define the treatment as only the participant households in the treated areas and drop the non-participant households in treated areas from the analysis. We use a kernel matching model with the same set of variables (Table A 8), and Figure A 1 shows that there is a satisfactory overlap in the propensity score distribution across program and non-program villages.
Overall, there is limited difference between the ATT (Table A 9) and ITT results. However, most livelihoods and labor related impacts do become stronger and more significant.
21 The field work for the data collection was initiated shortly after demonetization in November 2016. As food expenditure is highly elastic and sensitive to such external shocks, it is likely to be affected most by it. Due to the unavailability of monthly expenditure levels before demonetization in the sample, more detailed technical analysis is needed to understand the role of demonetization on consumption expenditure. Due to time limitations and the lack of additional data, we have not reported any results on consumption expenditure.
22 As the results from other specifications were consistent with the main model, they have not been reported in the paper.
23 Significance levels: *: 10%, **: 5%, ***: 1%.
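The ATT estimation described above can be sketched in a few lines. The example below is a minimal illustration of kernel matching on a precomputed propensity score, not the specification used for Table A 9: the Epanechnikov kernel, the bandwidth of 0.06, the function names, and the toy data are all assumptions made for illustration, and survey weights and standard errors are omitted.

```python
import numpy as np

def epanechnikov(u):
    """Epanechnikov kernel: 0.75 * (1 - u^2) for |u| <= 1, zero otherwise."""
    return np.where(np.abs(u) <= 1, 0.75 * (1 - u**2), 0.0)

def kernel_matching_att(pscore, treated, outcome, bandwidth=0.06):
    """ATT by kernel matching on a precomputed propensity score.
    Each treated unit is compared with a kernel-weighted average of
    control outcomes; treated units with no control inside the
    bandwidth are dropped (off common support).
    Returns the ATT and the number of treated units used."""
    pscore = np.asarray(pscore, dtype=float)
    treated = np.asarray(treated, dtype=bool)
    outcome = np.asarray(outcome, dtype=float)
    p_t, y_t = pscore[treated], outcome[treated]
    p_c, y_c = pscore[~treated], outcome[~treated]
    # weights[i, j] is the kernel weight of control j for treated unit i.
    weights = epanechnikov((p_t[:, None] - p_c[None, :]) / bandwidth)
    total = weights.sum(axis=1)
    on_support = total > 0
    counterfactual = (weights[on_support] @ y_c) / total[on_support]
    att = float(np.mean(y_t[on_support] - counterfactual))
    return att, int(on_support.sum())

# Toy data (hypothetical): three treated and three control households.
pscore = [0.30, 0.50, 0.31, 0.29, 0.50, 0.90]
treated = [1, 1, 0, 0, 0, 1]
outcome = [10.0, 12.0, 8.0, 6.0, 9.0, 20.0]

att, n_used = kernel_matching_att(pscore, treated, outcome)
print(f"ATT: {att:.2f} using {n_used} treated units on common support")
```

In this toy example the treated household with a propensity score of 0.90 has no comparable control and is excluded, which is the situation that the overlap check in Figure A 1 is meant to rule out for the bulk of the sample.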

Distributional Effects
The primary goal of NRLM is to provide livelihood security to vulnerable rural households. The ITT effect model reported in Table A 7 estimates the causal effect of the binary treatment on program outcomes. Therefore, the NRLP impacts estimated thus far are average estimates of the program effect that do not distinguish across different segments of the population of rural households. We use quantile regressions to account for this heterogeneity of treatment effects across the distribution of outcome variables, specifically total household income, income from migration, and borrowings. Results from weighted quantile regressions are presented in Table 7. The impact of NRLP varies significantly across the quantiles of total income, income from migration, and borrowing.
Based on the quantile regression estimates in Table 7, income from migration is distributed more equitably across quantiles. In terms of magnitude, households in almost all quantiles experienced a similar positive change in migration income. For the statistically significant quantiles, that is, 0.20, 0.30, and 0.40, income from migration increased by INR 5,348.84, INR 5,000.00, and INR 5,973.79, respectively. Households in the higher quantiles of 0.75 and 0.90 did not experience a significant change in their migration income.

Figure 3 Quantile Graph 24
With respect to the size of borrowings, the program benefits declined monotonically from the lowest to the highest quantile. Except for the lowest-quantile households (bottom 20 percent of the households), who could borrow INR 866.67 more due to the program implementation, the borrowing levels for the higher quantiles (60th, 75th, and 90th percentiles) declined considerably in program areas as compared to control areas. The size of borrowings is still very small but indicates that program targeting is robust in terms of providing benefits to the rural poor households.
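To make the idea of quantile-specific effects concrete, the sketch below computes treatment-control differences at selected weighted quantiles. It is a simplification rather than the model behind Table 7: it omits covariates, in which case the quantile regression coefficient on a binary treatment reduces to the difference between the two groups' quantiles. The function names, survey weights, and income figures are hypothetical.

```python
import numpy as np

def weighted_quantile(values, weights, q):
    """Smallest value whose cumulative weight share reaches q."""
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    order = np.argsort(values)
    sorted_values, sorted_weights = values[order], weights[order]
    cum_share = np.cumsum(sorted_weights) / sorted_weights.sum()
    idx = np.searchsorted(cum_share, q, side="left")
    return sorted_values[min(idx, len(sorted_values) - 1)]

def quantile_effects(outcome, treated, weights, quantiles):
    """Difference between treated and control weighted quantiles."""
    outcome = np.asarray(outcome, dtype=float)
    treated = np.asarray(treated, dtype=bool)
    weights = np.asarray(weights, dtype=float)
    effects = {}
    for q in quantiles:
        q_treated = weighted_quantile(outcome[treated], weights[treated], q)
        q_control = weighted_quantile(outcome[~treated], weights[~treated], q)
        effects[q] = q_treated - q_control
    return effects

# Hypothetical annual incomes (INR thousands) with equal survey weights.
income = [10, 20, 30, 40, 50, 15, 25, 45, 55, 65]
treated = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
weights = [1.0] * 10

effects = quantile_effects(income, treated, weights, [0.2, 0.5, 0.9])
for q, effect in effects.items():
    print(f"Quantile {q:.2f}: treatment-control difference = {effect:.1f}")
```

An effect that differs across quantiles, as in this toy example, is the pattern that an average treatment effect alone would not reveal.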

Conclusion
The decline in female labor force participation has been a major policy challenge in India, especially the falling participation in rural areas. Women-based groups such as SHGs have often been cited as one of the key interventions to address this issue (Sanghi, Srija, and Vijay 2015). The National Rural Livelihoods Mission was initiated with these objectives and is being rapidly scaled up by the Government of India. Thus, understanding the impact and efficacy of this large-scale intervention at an early stage is critical from a policy perspective. Our study provides short-run results based on data from three states where the program has had considerable reach and intensive support through the NRLP. The gradual roll-out of the program over the last five years gives us an opportunity to rigorously analyze the impact, and we find four noteworthy results.
First, the intervention has been able to achieve its objective of retaining women in the labor force by increasing opportunities for self-employment, including transitioning agriculture to higher-value activities and investments. Our estimates suggest that overall 5.5 percent more women are working (part-time and full-time) due to the intervention, with a 7.7 percent increase evident for women of productive ages (15-64 years old), and these effects remain strong in all our robustness checks.
Second, we find large and significant increases in the amount and share of formal savings, probably because of SHGs as a commitment device and as more women have opened bank accounts. Consistent with almost all other studies on the self-help groups, we find that the overall amount and productivity of borrowing at the household level have increased. Similar to Hoffmann et al. (2017), we find a large and significant reduction in the average interest paid by households in the treatment areas. However, the impacts on these aspects are not very different for any sub-population.
Third, contrary to previous studies on similar programs, we find weaker results in the asset status of households. We do find minor increases in livestock assets and consumptive assets, but no increase in other productive assets and nature of landholdings. As the primary mechanism to build assets was through loans, it is possible that the average loan size at INR 12,830 is too small to purchase big and productive assets.
Lastly, we find significant results in median household income as well as large and significant increases for the 30th, 40th, and 75th percentiles. However, we do not find results or trends on annual income of households in the overall population. As the average exposure of the program in the treatment areas is only around 2.5 years and the implementation of technical livelihoods interventions is still at a nascent stage, it is possible that in the long run some of these results in assets and income might strengthen and downstream impacts might arise.
For future policy discussions, we have some concluding thoughts. First, the focus of future interventions needs to shift from pure access to loans to more targeted borrowing and probably larger loans. Second, the program needs to build on increased labor force participation and use it to enhance income generation for participants by accelerating the technical and livelihoods interventions; efforts are also needed to document the impacts of these sub-interventions. New interventions are being piloted and tested in these two areas (World Bank 2018). Finally, the ongoing and future impact evaluations of these large-scale government programs need to have a broader focus on labor force participation and income, in particular to better understand the long-term impacts of these programs.