# How were the data analysed?

Having described the tools used to collect data for the project, this section provides an overview of how the data were analysed. Statistical analysis assesses the statistical significance of an estimated relationship, that is, how unlikely it is that an observed relationship between two variables arose by chance alone. The analysis for this project involved both statistical tests and regression analysis.

Statistical tests, such as t-tests and chi-squared tests, assess the association between two variables without controlling for other factors. A t-test compares the means of a dependent variable for two independent groups. For example, it can be used to test whether the average number of workers hired differs between agricultural households with emigrants and those without. A chi-squared test is applied when investigating the relationship between two categorical variables, such as private school attendance (which has only two categories, yes or no) among children living in two types of households: those receiving remittances and those not. In both cases, the test indicates how likely it is that the observed relationship is due to chance.
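The two tests can be illustrated with a minimal Python sketch. It is not the project's actual code: the function names, the choice of Welch's variant of the t-test, the omission of a continuity correction in the chi-squared test, and all numbers are assumptions made for illustration only.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return the Welch t-statistic and its approximate degrees of freedom."""
    na, nb = len(sample_a), len(sample_b)
    # Squared standard error contributed by each group
    va, vb = variance(sample_a) / na, variance(sample_b) / nb
    t_stat = (mean(sample_a) - mean(sample_b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom
    dof = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t_stat, dof


def chi2_2x2(table):
    """Return the Pearson chi-squared statistic and p-value for a 2x2 table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i in range(2):
        for j in range(2):
            # Count expected in each cell if the two variables were unrelated
            expected = row_totals[i] * col_totals[j] / total
            stat += (table[i][j] - expected) ** 2 / expected
    # With 1 degree of freedom the upper-tail probability has a closed form
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical data: number of workers hired per agricultural household
workers_with_emigrants = [3, 5, 4, 6, 2, 5]
workers_without_emigrants = [2, 3, 1, 4, 2, 3]
t_stat, dof = welch_t(workers_with_emigrants, workers_without_emigrants)

# Hypothetical counts. Rows: households receiving remittances / not receiving;
# columns: children attending private school yes / no
school_table = [[30, 70], [20, 80]]
chi2_stat, chi2_p = chi2_2x2(school_table)

print(f"t = {t_stat:.2f} (df approx. {dof:.1f})")
print(f"chi-squared = {chi2_stat:.2f}, p = {chi2_p:.3f}")
```

A large t-statistic in absolute value, or a small chi-squared p-value, suggests that the difference between the two groups is unlikely to be due to chance. Neither test says anything about other household characteristics that may drive the difference.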

In addition, regression analysis is used to estimate the quantitative effect of one variable upon another, while controlling for other factors that may also influence the outcome. The household and community surveys included rich information about households, their members, and the communities in which they live. This information was used to create control variables that were included in the regression models in order to isolate the effect of a variable of interest from other characteristics of the individuals, households and communities that may affect the outcome.
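The role of control variables can be shown with a small, self-contained Python sketch. The data are constructed for illustration and do not come from the surveys: the outcome is built as 1 + 2 × (receives remittances) + 0.5 × (household size), and remittance-receiving households are made larger on average. The `ols` helper and all variable names are hypothetical.

```python
def ols(rows, y):
    """Estimate OLS coefficients by solving the normal equations (X'X) b = X'y."""
    X = [[1.0] + list(r) for r in rows]  # prepend a constant for the intercept
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(k)]
        + [sum(x[i] * yi for x, yi in zip(X, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Constructed data: remittance-receiving households are larger on average
receives_remittances = [0, 0, 0, 0, 1, 1, 1, 1]
household_size = [3, 4, 3, 4, 5, 6, 5, 7]
outcome = [1 + 2 * r + 0.5 * s
           for r, s in zip(receives_remittances, household_size)]

# Without the control, the remittance coefficient also absorbs the size effect
naive = ols([(r,) for r in receives_remittances], outcome)
# With the control, the coefficient used to build the data is recovered
with_control = ols(list(zip(receives_remittances, household_size)), outcome)

print(f"without control: {naive[1]:.3f}")
print(f"with control:    {with_control[1]:.3f}")
```

In this constructed example the coefficient on remittances is 3.125 without the control and 2.0 once household size is included, which is the value used to generate the data. The gap is the part of the household size effect that would otherwise be wrongly attributed to remittances.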

Two basic regression models were used in the analysis: ordinary least squares (OLS) and probit models. The choice between them depends on the nature of the outcome variable. OLS regressions are applied when the outcome variable is continuous. Probit models are used when the outcome variable can only take two values, such as owning a business or not.
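A probit model for a binary outcome can be sketched in plain Python as follows. This is an illustrative implementation under stated assumptions, not the estimation routine used in the project: it handles only an intercept and a single regressor, it is fitted by Fisher scoring, and the business-ownership counts are made up.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def fit_probit(x, y, iterations=25):
    """Fit P(y = 1) = Phi(a + b * x) by Fisher scoring and return (a, b)."""
    a, b = 0.0, 0.0
    for _ in range(iterations):
        # Score vector (g0, g1) and information matrix [[h00, h01], [h01, h11]]
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            eta = a + b * xi
            p = STD_NORMAL.cdf(eta)
            d = STD_NORMAL.pdf(eta)
            w = d / (p * (1.0 - p))
            g0 += w * (yi - p)
            g1 += w * (yi - p) * xi
            h00 += w * d
            h01 += w * d * xi
            h11 += w * d * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton-type update: solve the 2x2 system H * step = g
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


# Hypothetical data: 10 of 40 households without emigrants own a business,
# against 20 of 40 households with emigrants
has_emigrant = [0] * 40 + [1] * 40
owns_business = [1] * 10 + [0] * 30 + [1] * 20 + [0] * 20

a, b = fit_probit(has_emigrant, owns_business)
# The coefficient b is on the latent index scale; the change in the predicted
# probability is obtained by passing the index through the normal CDF
effect = STD_NORMAL.cdf(a + b) - STD_NORMAL.cdf(a)

print(f"intercept = {a:.3f}, coefficient = {b:.3f}")
print(f"difference in predicted probability = {effect:.3f}")
```

With a single binary regressor, the fitted probabilities reproduce the two group shares (0.25 and 0.50 here), so the difference in predicted probability is 0.25. The probit coefficient itself is not a change in probability and should not be read as one.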

The analysis of the interrelations between public policies and migration is performed at both the household and the individual level, depending on the topic and hypothesis investigated. The analysis for each sector is divided into two sections:

• The impact of a migration dimension on a sector-specific outcome

• The impact of a sectoral development policy on a migration outcome

The regression analysis rests on four sets of variables:

A) Migration, comprising: (1) migration dimensions including emigration (sometimes using the proxy of an intention to emigrate in the future), remittances and return migration; and (2) migration outcomes, which cover the decision to emigrate, the sending and use of remittances, and the decision to return and the sustainability of return migration.

B) Sectoral development policies: a set of variables representing whether an individual or household took part in or benefited from a specific public policy or programme in four key sectors: the labour market, agriculture, education, and investment and financial services.

C) Sector-specific outcomes: a set of variables measuring outcomes in the project's sectors of interest, such as labour force participation, investment in livestock rearing, school attendance and business ownership.

D) Household and individual-level characteristics: a set of socio-economic and geographical explanatory variables that tend to influence migration and sector-specific outcomes.
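The way these four sets of variables combine in the two directions of analysis can be written in stylised form. The notation below is an assumption introduced here for illustration and is not taken from the project's own specifications: $Y_i$ is a sector-specific outcome (set C), $M_i$ a migration dimension (set A, part 1), $O_i$ a migration outcome (set A, part 2), $P_i$ an indicator of participation in a policy or programme (set B), and $X_i$ the vector of household and individual-level controls (set D).

```latex
\begin{align}
Y_i &= \beta_0 + \beta_1 M_i + \gamma' X_i + \varepsilon_i \\
\Pr(Y_i = 1) &= \Phi(\beta_0 + \beta_1 M_i + \gamma' X_i) \\
O_i &= \alpha_0 + \alpha_1 P_i + \delta' X_i + u_i
\end{align}
```

The first line is the OLS form for a continuous outcome and the second is its probit counterpart for a binary outcome, where $\Phi$ is the standard normal cumulative distribution function. The third line reverses the direction of analysis, with a migration outcome explained by policy participation; the probit form would apply in the same way when $O_i$ is binary, as with the decision to emigrate. The coefficients of interest are $\beta_1$ and $\alpha_1$.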