Parental Networks, Wage Expectations, and the Intergenerational Educational Mobility*

We develop a theoretical labour market model with two generations of workers, endogenous social networks of parents and binary schooling choices of children. Since the market skill premium is unobservable, families rely on noisy wage information obtained from their social contacts, giving rise to heterogeneous expectations across families. If social networks are subject to skill homophily and high skill parents are a minority, then children in low skill families are more strongly affected by the lack of objective information, their expectations are more dispersed and they are less likely to study, giving rise to a positive intergenerational schooling correlation. Next, we calibrate the model to German labour market data (SOEP) and show that the described mechanism can potentially account for up to 15% of the intergenerational schooling correlation. Moreover, the data strongly support the idea of skill homophily in social networks. Finally, we extend the model to analyze the interaction between the network effect and the classical ability transmission for intergenerational mobility. We find that the interaction term is always negative, crowding out the network effect, which may help to explain the small network and neighborhood effects reported in recent empirical work. Nevertheless, accounting for the network effect improves the fit of the model and helps to explain an inverse U-shape relationship between the intergenerational schooling correlation and the share of high skill parents observed in the data.


Introduction
A vast body of empirical research shows that educational outcomes of parents and children are strongly correlated.¹ By now there is consensus in the literature that the intergenerational transmission of schooling is accounted for by a collection of factors and mechanisms. These include genetic transmission of cognitive and non-cognitive skills and abilities, wealth inheritance and financial constraints as well as public policies in the educational system. Nevertheless, Bowles and Gintis (2002, pp. 4-5) conclude that the "transmission of economic success across generations remains something of a black box" since existing mechanisms explain "at most three-fifths of the intergenerational transmission of economic status". Thus in this paper we direct our attention to a novel mechanism and investigate the role of parental social networks for the expectations and educational attainment of children. Moreover, we calibrate the developed model to the empirical data for Germany (SOEP), quantify the contribution of this mechanism and analyze its interaction with other transmission channels.
More specifically, we develop a labour market model with endogenous social networks of parents. In addition, we incorporate recent evidence summarized in French and Oreopoulos (2017) that there is a high degree of uncertainty among the youth concerning the gains and costs of higher education and even the application process itself. So we think of the real return to schooling/skill premium as unknown to the market participants.² Hence, young individuals in the model use all available information to obtain an estimate of the return to schooling and use this estimate to decide whether or not to acquire higher education. In particular, we assume that the offspring generation uses noisy wage observations of social contacts in parental networks to form their expectations about the market skill premium. In this setting we show that differences in the composition of social networks across families give rise to heterogeneous expectations of children and lead to different educational decisions and labour market outcomes.
When modeling the networks of parents we take into account a common empirical fact that social network formation is subject to skill homophily, meaning that individuals coming from the same skill group are more likely to form social links (see Montgomery (1991) and McPherson et al. (2001)). In a labour market with a binary skill structure this leads to a situation when high skill friends are overrepresented in the social networks of high skill parents, whereas friendship ties with low skill workers are less common in high skill families. The opposite is true for low skill parents. However, the situation of children in the two types of families is not symmetric since high skill workers are a minority and high skill wages tend to be more dispersed than low skill wages. In this setting we show that social networks of high skill parents are more balanced between the two skill groups, thus they possess more precise information about the market skill premium.
Intuitively, this means that expectations about the skill premium are less dispersed across high skill families, and so children in these families are more likely to acquire higher education. This mechanism gives rise to a positive correlation in the schooling outcomes of parents and children in isolation from other traditional explanations of intergenerational mobility.
¹ We review this literature in the next subsection.
² Some justification for this assumption can be found in Card (2001), who presents a summary of 11 studies estimating the return to schooling. The table shows a large variation in the estimates depending on the time period, sample restrictions and the estimation techniques. This evidence suggests that even among researchers there is no consensus about the real value of the market skill premium.

Concerning expectations, we start with a model where wages of parental social contacts are perfectly observed by the offspring in order to provide the intuition, but we calibrate a modified model where wage observations in the social networks of parents are subject to an observation error. The intuitive idea behind this setting is that children obtain an imprecise signal of the actual labour income of a family friend by observing this friend's living expenses (e.g. large or small house, luxurious or simple furniture, expensive or cheap car etc.). This modification creates a situation where some wages are observed precisely (e.g. wages of parents), whereas other wage observations are subject to noise, leading to a problem of heteroscedasticity in the expectation formation. We show that this problem can be dealt with by applying a weighted least squares approach, implying that children should optimally assign a higher weight to the wage of their parent and lower weights to the remaining wage observations stemming from the network of the parent. This approach gives rise to efficient estimates of the return to schooling. Overall, however, the intergenerational correlation coefficient is only moderately sensitive to the noise parameter since social networks of parents are sufficiently large and the noise variable is unbiased.
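The weighting logic described above can be illustrated with a minimal sketch. It assumes a setup that is not spelled out in this section: the parent's wage is observed exactly, each contact's wage is observed with additive, unbiased, independent noise, and the child estimates the mean wage of one skill group. The function names and the numerical values are hypothetical.

```python
def optimal_parent_weight(n_contacts, wage_var, noise_var):
    """Weight on the parent's wage that minimises the variance of the weighted mean.

    Assumption (not from the paper's text): the parent's wage has variance
    wage_var, each contact's observed wage has variance wage_var + noise_var,
    and all draws are independent. Weights are proportional to precisions.
    """
    precision_parent = 1.0 / wage_var
    precision_contact = 1.0 / (wage_var + noise_var)
    return precision_parent / (precision_parent + n_contacts * precision_contact)


def weighted_mean_variance(s, n_contacts, wage_var, noise_var):
    """Variance of s * parent_wage + (1 - s) * mean(observed contact wages)."""
    return s**2 * wage_var + (1 - s)**2 * (wage_var + noise_var) / n_contacts


# Hypothetical values: 4 contacts, unit wage variance, unit noise variance.
n_contacts, wage_var, noise_var = 4, 1.0, 1.0
s_equal = 1.0 / (1 + n_contacts)                      # equal weighting
s_star = optimal_parent_weight(n_contacts, wage_var, noise_var)

print("equal weight on parent:  ", s_equal)
print("optimal weight on parent:", s_star)
print("variance with equal weights:  ",
      weighted_mean_variance(s_equal, n_contacts, wage_var, noise_var))
print("variance with optimal weights:",
      weighted_mean_variance(s_star, n_contacts, wage_var, noise_var))
```

Without noise the optimal weight collapses to equal weighting, 1/(1 + n); with noise the precisely observed parental wage receives a larger weight, in line with the weighted least squares argument.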
In the next step the modified model is calibrated to match empirical observations in the German labour market based on the Socio-Economic Panel (SOEP). The advantage of using this data is that it includes information on social networks combined with education and labour market outcomes of parents and children. We find that the German empirical data on social networks strongly support the idea of skill homophily. For example, the share of high skill friends in the networks of high skill workers is almost 20% above the population average, whereas the share of high skill friends in the networks of low skill workers is almost 20% below the population average. Based on this information we estimate a model-implied skill homophily parameter and find that it is remarkably stable over time. The data also support the assumptions that high skill workers with an upper secondary school certificate ("Abitur") are a minority in the labour market (24.8% in the generation of parents and 37.8% in the generation of children) and that their wages are more dispersed compared to low skill workers. The latter observation has an amplifying effect on the intergenerational transmission of schooling in our model. Based on the performed calibration we find that the described mechanism building on parental social networks and labour market uncertainty can potentially explain up to 15% (= 0.055/0.375) of the observed schooling correlation. Furthermore, we use the calibration to analyze the influence of the key model parameters on intergenerational mobility and to evaluate the sensitivity of different mobility measures to these parameters. We find that the cost of education, network size, skill homophily and the share of high skill parents have an inverse U-shape impact on the intergenerational correlation coefficient and the mobility ratio.
Quantitatively, however, the effect of the network size is small, whereas both mobility measures are sensitive to the cost of education, the skill homophily and the share of high skill families in the economy. Given that the cost of education (monetary expenses and effort) is generally unobservable and the skill homophily parameter is found to be stable over the last two decades, we focus on the share of high skill families and its changes over time to test the model prediction. When the share of high skill parents is small and increasing, this is more beneficial to children in high skill families due to a higher propensity of creating links with other high skill parents. Thus, children in well-off families are more likely to study, thereby increasing the intergenerational schooling correlation and the mobility ratio. However, the marginal gain of children in higher status families falls if the share of these families continues to grow, leading to a reversed effect and an inverse U-shape relationship. In order to test this prediction we split the sample into 15 birth cohorts, with the earliest born in 1953-1955 and the most recent one born in 1995-1997. The data confirm the prediction, showing that the share of high skill parents increased rapidly in the considered period, whereas both measures of intergenerational mobility increased initially, reached a maximum for the birth cohort 1983-1985 and declined thereafter.
Finally, we extend the model to allow for the transmission of abilities, which can be genetically inherited (nature) or transmitted in early childhood (nurture). The purpose of this extension is to analyze the interaction of the new network-driven and the traditional ability-driven components in shaping the intergenerational schooling correlation. In this extension we prove theoretically that the interaction effect is always negative. As the ability correlation of parents and children gets stronger, the number of high ability children born into low skill families declines, making the informational disadvantage of low skill families less important and crowding out the network effect. This theoretical finding may be relevant for the empirical research on intergenerational mobility since a failure to account for the interaction between ability transmission and the network/neighborhood effect may lead to a downward bias and low estimates of the latter (see for example, Chetty et al. (2016) and Plug et al. (2018)). Despite this crowding out effect we find that a combination of the ability transmission and the network effect substantially improves the predictive power of the model and its fit to the data, especially in the tails, where the share of high skill parents in the market is very small or very large.

Related literature
Our study is related to several strands of literature. Empirical work on measuring intergenerational correlations between the income/schooling levels of parents and their children in a multi-country framework includes Björklund and Jäntti (1997), Chevalier et al. (2003), Hertz et al. (2007), Bratberg et al. (2016) and Leone (2019). Hertz et al. (2007) provide an extensive survey of estimates on the parent-child educational correlations. Their estimates vary from 0.45-0.55 for low mobility countries (Italy, Slovenia, Austria, Ireland, USA, and Switzerland) down to 0.3-0.4 for countries with high educational mobility (Sweden, Belgium, Netherlands, Norway, Finland, UK, and Denmark). Björklund and Jäntti (1997) and Bratberg et al. (2016) confirm the finding that intergenerational mobility is lower in the USA compared to Sweden and Norway.
Empirical evidence for Germany is provided in Couch and Dunn (1997), Heineck and Riphahn (2009), Riphahn and Schieferdecker (2012) and Leone (2019). The first of these studies utilizes early waves of the Socio-Economic Panel and reports estimates of the intergenerational schooling correlations in the range 0.237-0.39 depending on the gender of the parent and the child, which puts Germany into the group of countries with relatively high intergenerational mobility. Empirical estimates reported in Leone (2019) support this view.
The literature proposes several explanations for the observed correlations (see Piketty (2000) for more details). First, cognitive and non-cognitive skills and abilities, as well as preferences and social norms, may be inherited by children from their parents (Becker and Tomes (1986)). On the one hand, abilities can be transmitted genetically from parents to their biological children (nature). On the other hand, better educated parents are more efficient in transmitting knowledge and supporting their children in the studying process (nurture). Several empirical studies report strong correlations in cognitive abilities of parents and children based on IQ test scores, highlighting the importance of a combined effect of nature and nurture (Black et al. (2009), Björklund et al. (2010)). Sacerdote (2007) attempts to separate the contribution of nature and nurture by comparing educational outcomes of adoptive and biological children and reports that the genetic transmission of abilities contributes more to explaining educational attainment of children. The study by Holmlund et al. (2011) comes to a similar conclusion that the parental nurture effect does not play a large role.
Second, if the credit market is not perfect, financial constraints may prevent children in poor families from acquiring education (Galor and Zeira (1993)). Yet, empirical support for this transmission channel is rather weak. For example, Carneiro and Heckman (2002) and Belley et al. (2011) find little evidence that short term credit constraints explain a part of the gap in college enrollment between children from high and low income families. Third, the institutional design of the education system may play a crucial role in the attainment of education (Checchi et al. (1999), Checchi and Flabbi (2007)). These studies argue that countries with education systems characterized by equal access to education are more likely to have higher educational mobility, and that public expenditure could be one of the effective factors for increasing equal opportunity in education. For example, Chevalier et al. (2003) find a negative relationship between the intergenerational schooling correlation and the share of public expenditure in tertiary education.
Fourth, and most closely related to our research, is the explanation that children's education levels and their labor market outcomes can also be affected by the neighborhood or social environment where they grow up (Chetty et al. (2014, 2016), Katz et al. (2001)). Here one can distinguish between the social and the geographical dimension even though the two are closely  (2020) suggest that the quality of school and neighborhood could lead to an increase in the share of graduating students. On the other hand, Chetty et al. (2016) analyze the impact of the "Moving to Opportunity" experiment in the USA and find that moving to a better neighborhood has a negative impact on the educational outcomes of adolescents and a moderately positive impact for children who were young at the time of the family move.
The literature on the network dimension of the living environment is rather scarce. Several empirical studies investigate the effect of parental job referrals on their children's labour market outcomes. For example, Kramarz and Skans (2014) analyze Swedish data and find that parental network effects are stronger for lower educated children when they seek their first jobs. Corak and Piraino (2011) consider the intergenerational transmission of employers and document that the probability of having the same employer for parents and children is higher for the top-earning groups in Canadian data. Going beyond referral hiring, Plug et al. (2018) investigate the effect of parental friendships on the labor market outcomes of children. They find that the friendship networks of parents have a stronger effect on the occupational choices of children at the early stages of their careers. Comparing the network and the geographic components of the living environment, Del Bello et al. (2015) suggest that the peer effect on education is stronger than the effect of neighborhood.
To the best of our knowledge there is only one theoretical study investigating the impact of social networks on intergenerational mobility, by Calvó-Armengol and Jackson (2009). The key element of their model is an assumed complementarity in the actions of players, meaning that a person is more likely to choose a high action (e.g. higher education) if there are many actors in the surrounding environment taking the same action. This assumption leads to situations where the actions of parents and children are correlated due to the common exposure to the same social environment without a direct transmission of skills or abilities from parents to children. In this paper we continue this strand of research but deviate from the exogenous complementarity assumption. Instead, we look deeper at the underlying mechanisms of network and expectation formation in a situation where the market skill premium is unobservable, giving rise to heterogeneous expectations and correlated schooling outcomes of parents and children. Hence we contribute to the described literature in the following ways: 1) by advancing and extending the theoretical framework modeling the link between social networks and intergenerational mobility; 2) by analyzing the impact of network structure (e.g. network size and skill homophily) and labour market uncertainty on intergenerational mobility measures, which is only possible if the network structure is modeled explicitly; 3) by quantifying the potential contribution of the developed mechanism based on the German labour market data; 4) by shedding light on the interaction between traditional explanatory factors, e.g. ability transmission, and network effects, which can guide future empirical work.

The article is organized as follows. Section 2 presents the core theoretical model with endogenous social networks, heterogeneous expectations of the offspring generation and their educational choices. Section 3 describes the data and the calibration approach. Section 4 presents a number of numerical results complementing the theory. Section 5 develops an extension of the core model accounting for the transmission of abilities and investigates the interaction between the network component and the ability component of intergenerational schooling dependence measures. Section 6 concludes the paper.

The Model
In this section we develop the analytical model and consider the choice of schooling from an intergenerational perspective. There are two generations of workers: parents (first generation) and children (second generation). We abstract from fertility issues and assume a "one parent - one child" family structure. In section 2.1 we explain how the social networks of parents are formed. Further, in section 2.2 we show how the generation of children forms expectations about the market return to schooling (skill premium) by observing wages in the networks of their parents. Based on the estimated return to schooling, inborn ability and the corresponding cost parameter they decide to become high or low skilled, thus we model schooling as a binary choice variable. Parents and children share the same information and their goals are aligned.
First generation workers are heterogeneous with respect to their skill, hence some children are born in low skill families L, while others are born in high skill families H. Let h < 0.5 denote the fraction of high skill families/parents and 1 − h the fraction of low skill families/parents.
All parents are employed in the labour market and receive (log) wages $w^L_i$ or $w^H_i$ depending on their skill level.³ Wage differences among workers with identical skills could stem from the heterogeneity of employers or reflect different wage strategies of identical firms (e.g. endogenous wage dispersion in the spirit of Burdett and Mortensen (1998)). We interpret these (log) wages as life-time income and assume that they are normally distributed with the corresponding means $\mu_L$, $\mu_H$ and variances $\sigma_L^2$, $\sigma_H^2$:

$$w^L_i \sim N(\mu_L, \sigma_L^2), \qquad w^H_i \sim N(\mu_H, \sigma_H^2).$$

High skill workers earn more on average, so that $\mu_H > \mu_L$. The assumption that (log) wages are normally distributed is not restrictive since the model is based on average wages, which would be asymptotically normally distributed under the Central Limit Theorem (given independent draws). However, we also consider a weighted average wage, where the CLT is not applicable.
The inborn ability of children can be high with probability p and low with probability 1 − p.
In the main specification of the model we consider the setup where $p$ is identical in high and low skill families, therefore any correlation in the educational attainment of parents and children is due to differences in the available information about the labour market obtained via the parental network of social contacts. In section 5 we consider an extension of the model where $p_H > p_L$, giving rise to a positive correlation in abilities between parents and children. For this case we derive a novel decomposition of the intergenerational schooling correlation into the ability component, the network component and the interaction term.
For high ability children the cost of obtaining education is denoted by $c$.⁴ We assume that the cost of education is below the skill premium, that is, $c$ is smaller than the difference between the mean (log) wages of high and low skill workers, thus education is gainful on average for high ability individuals. The cost parameter $c$ includes monetary costs and effort associated with studying. High ability children observe wages and skills among the contacts of their parents and use this information to form an expectation about the market skill premium. This is a benchmark specification. In section 2.4 we consider a modified specification where wages of social contacts are observed with an error/noise. Based on the estimated skill premium and the cost of obtaining education $c$, high ability children decide whether or not to obtain higher education and become high-skilled workers. The cost of obtaining education for low ability children is prohibitively high, so they do not consider acquiring education.

³ In the extended model we allowed for a higher risk of unemployment faced by low skill parents; however, it has a negligible impact on the quantitative results and is disregarded in the following for ease of exposition.
⁴ The cost is expressed as a % of the average low skill wage to be comparable with the return to schooling.

Social networks
In this section we analyze network formation in the first generation of workers. In doing so we follow the random matching approach developed in Neugart and Zaharieva (2018). Formation of social links is subject to skill homophily, that is workers are more likely to create social links with others of the same skill type. In general, homophily refers to the fact that people are more likely to maintain relationships with other people who are similar to themselves. There can be homophily measured by age, race, gender, religion or skill and it is generally a robust observation in social networks (see McPherson et al. (2001) for an overview of research on homophily). The focus of this paper is on the latter type of homophily by the level of education/skill. To the best of our knowledge, Montgomery (1991) was the first study that introduced network homophily (the "inbreeding bias") into an analytical labour market model.
At rate φ every worker can be randomly matched with another worker. Let τ denote the probability of creating a social link with a worker of the same skill type and let (1 − α)τ, 0 < α < 1, be the probability of creating a link with a worker of a different skill type (conditional on matching). Note that the limiting case α = 0 corresponds to the situation without homophily, while α = 1 corresponds to complete network segregation by skill. Hence α measures the degree of skill homophily in the society. In order to keep the model tractable we consider directed links. This means that, if two workers i and j are randomly matched, it is possible that j becomes a social contact of i but not necessarily vice versa. Every social link can be destroyed at rate δ.
Let $\varepsilon^{ij}_k$ denote the fraction of type $i$ workers with exactly $k$ social contacts of type $j$, $i, j \in \{L, H\}$. This is a fraction out of all type $i$ workers. Consider some type $L$ worker without contacts of the same type. With our notation this worker belongs to the group $\varepsilon^{LL}_0$. At rate $\varphi$ this worker is matched with some other individual. With probability $1-h$ this individual is of the same type $L$, so the social link is created with probability $\tau$. Next, consider another type $L$ worker with only one contact of the same type, belonging to the group $\varepsilon^{LL}_1$. This person may lose the only contact at rate $\delta$. More generally, a worker with $k$ contacts loses one of them at rate $k\delta$. In the steady state, the flows of workers between the two states $k-1$ and $k$ are equalized, which means:

$$\varphi(1-h)\tau\,\varepsilon^{LL}_{k-1} = k\delta\,\varepsilon^{LL}_{k}, \qquad k = 1, 2, \ldots$$

Let $\phi \equiv \varphi\tau/\delta$ to simplify the notation, so that $\varepsilon^{LL}_k = \varepsilon^{LL}_0\,[\phi(1-h)]^k/k!$. Since all fractions $\varepsilon^{LL}_k$ should add up to 1 for $k = 0, \ldots, \infty$ we get $\varepsilon^{LL}_0 = e^{-\phi(1-h)}$, and the number of type $L$ contacts has a Poisson distribution with parameter $\phi(1-h)$. This also means that $\phi(1-h)$ is the average number of low-skilled contacts in the social network of a randomly chosen low-skilled worker. We denote it by $n_{LL}$, so that $n_{LL} = \phi(1-h)$ and:

$$\varepsilon^{LL}_k = \frac{(n_{LL})^k\, e^{-n_{LL}}}{k!}.$$

Alternatively, the type $L$ person can be matched with another person of type $H$, which happens at rate $\varphi h$. Conditional on matching, the social link is formed with probability $(1-\alpha)\tau$. Repeating the same steps as above we get:

$$\varepsilon^{LH}_k = \frac{(n_{LH})^k\, e^{-n_{LH}}}{k!}, \qquad n_{LH} = \phi h(1-\alpha).$$

Here $n_{LH}$ is the average number of high-skilled contacts in the social network of a randomly chosen low-skilled worker. Let $n_L$ denote the average network size for type $L$ workers and $\gamma_L$ be the fraction of type $L$ contacts in their network, so we get:

$$n_L = n_{LL} + n_{LH} = \phi(1-\alpha h), \qquad \gamma_L = \frac{n_{LL}}{n_L} = \frac{1-h}{1-\alpha h}.$$

Using the same approach for type $H$ workers we get:

$$n_{HH} = \phi h, \qquad n_{HL} = \phi(1-h)(1-\alpha), \qquad n_H = \phi[1-\alpha(1-h)], \qquad \gamma_H = \frac{n_{HH}}{n_H} = \frac{h}{1-\alpha(1-h)},$$

where $n_H$ is the total network size of high-skilled workers and $\gamma_H$ is the fraction of type $H$ contacts in their network. One can see that the case of full homophily ($\alpha = 1$) leads to the complete segregation of social networks between the two skill groups, that is $\gamma_L = \gamma_H = 1$.
In the opposite case without homophily ($\alpha = 0$), the fraction of contacts of the same type is equal to the fraction of this type in the total population, that is $\gamma_L = 1-h$ and $\gamma_H = h$. We can also see that high-skilled contacts are underrepresented in the networks of low-skilled workers ($1-\gamma_L < h$), whereas they are overrepresented in the networks of high-skilled workers ($\gamma_H > h$):

$$1-\gamma_L = \frac{h(1-\alpha)}{1-\alpha h} \;<\; h \;<\; \frac{h}{1-\alpha(1-h)} = \gamma_H.$$

Recall that $h$ is the fraction of high-skilled workers (parents) in the labour market, so we use it as a point of reference in the above expressions. Comparing the average sizes of social networks for low and high skilled workers we get the following:

Lemma 1: The networks of high skill workers/parents are more balanced, meaning $n_{LL} - n_{LH} > n_{HL} - n_{HH}$, but they are smaller on average, $n_H < n_L$, compared to the networks of low skill workers/parents.

Proof: Using $n_{LL} = \phi(1-h)$, $n_{LH} = \phi h(1-\alpha)$, $n_{HH} = \phi h$ and $n_{HL} = \phi(1-h)(1-\alpha)$ we get $(n_{LL} - n_{LH}) - (n_{HL} - n_{HH}) = \phi\alpha > 0$ and $n_L - n_H = \phi\alpha(1-2h) > 0$ for $h < 0.5$.

Lemma 1 shows that the networks of low skill families are more extreme compared to the networks of high skill families. There are two reasons for this finding. First, the number of high skill contacts in the networks of low skill families, $n_{LH} = \phi h(1-\alpha)$, is small because high skill workers are the minority, $h < 0.5$. Second, it is small because of skill homophily, $\alpha > 0$. At the same time, the number of low skill contacts in the networks of low skill families, $n_{LL} = \phi(1-h)$, is large because low skill workers are the majority in the labour market.
Finally, we can calculate the average size of the social network in the economy, which is denoted by $n$:

$$n = h\,n_H + (1-h)\,n_L = \phi\,[1 - 2\alpha h(1-h)].$$

We can see that parameter $\phi$ can be interpreted as the maximum average network size, attained in the absence of homophily ($\alpha = 0$). The reason is that any positive homophily $\alpha > 0$ reduces the network size because fewer matches become social ties.
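The network quantities of this subsection are simple functions of ϕ, h and α, so they can be evaluated directly. The following is a minimal sketch: the function name, the dictionary keys and the parameter values are hypothetical (not the calibrated ones), and the economy-wide average is assumed to be the population-weighted mean of the two group averages.

```python
def network_composition(phi, h, alpha):
    """Average numbers of contacts by skill type of the parent and of the contact.

    phi:   maximum average network size (matching rate * link probability / destruction rate)
    h:     share of high skill workers (h < 0.5)
    alpha: degree of skill homophily (0 < alpha < 1)
    """
    n_ll = phi * (1 - h)                 # low skill contacts of a low skill worker
    n_lh = phi * h * (1 - alpha)         # high skill contacts of a low skill worker
    n_hh = phi * h                       # high skill contacts of a high skill worker
    n_hl = phi * (1 - h) * (1 - alpha)   # low skill contacts of a high skill worker
    n_l = n_ll + n_lh
    n_h = n_hh + n_hl
    return {
        "n_LL": n_ll, "n_LH": n_lh, "n_HH": n_hh, "n_HL": n_hl,
        "n_L": n_l, "n_H": n_h,
        "gamma_L": n_ll / n_l,           # share of own-type contacts, low skill
        "gamma_H": n_hh / n_h,           # share of own-type contacts, high skill
        # Assumption: economy-wide average weights the groups by population share.
        "n_avg": h * n_h + (1 - h) * n_l,
    }


# Hypothetical parameter values, chosen only for illustration.
phi, h, alpha = 5.0, 0.25, 0.4
net = network_composition(phi, h, alpha)

# Lemma 1: high skill networks are more balanced but smaller.
more_balanced = (net["n_LL"] - net["n_LH"]) > (net["n_HL"] - net["n_HH"])
smaller = net["n_H"] < net["n_L"]

for key, value in net.items():
    print(f"{key:8s} {value:.4f}")
print("Lemma 1 holds:", more_balanced and smaller)
```

With these illustrative values the share of high skill contacts lies below h in low skill networks and above h in high skill networks, which is the homophily pattern described in the text.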
This subsection shows that high and low skill parents differ in the composition of their social networks. Therefore, information about the market skill premium obtained by children through the social network of their parents is likely to vary across families, giving rise to heterogeneous expectations. We continue by analyzing this point in the next section.

Labour market
In this section we take the network composition of parents derived above, conditional on their skill type, and proceed by analyzing the educational decisions of children. In particular, we assume that children observe the skill types and wages of the social contacts of their parents and form expectations about the market skill premium (return to schooling). This is a benchmark specification. In section 2.4 we consider a modified specification where wages of social contacts are observed with an error/noise.
First, let us consider some family with a low skill parent $j$. Each low-skill parent has $n_{LL}$ low-skilled and $n_{LH}$ high-skilled contacts.⁵ Let $\bar{w}^{LH}_j = \frac{1}{n_{LH}}\sum_{i=1}^{n_{LH}} w^H_i$ be the average wage of all high-skilled contacts in family $j$. When considering an average wage of low skill contacts, the wage of the parent $w^L_j$ is an important source of information from the perspective of the child. Thus, we assign a weight $0 < s < 1$ to $w^L_j$ when calculating the average low skill wage estimated by the child in family $j$. This means:

$$\bar{w}^{LL}_j = s\,w^L_j + (1-s)\,\frac{1}{n_{LL}}\sum_{i=1}^{n_{LL}} w^L_i.$$

Note that $s = 1/(1+n_{LL})$ would mean that the wage of the parent is as important as each of the other wages of low skill contacts (equal weighting). For $s > 1/(1+n_{LL})$ children assign a higher weight to the wage of the parent, thus variable $s$ captures the importance of the parental wage for the decision of the child. At the same time the weight coefficient $s$ does not introduce a bias in the expectations of children because $w^L_j$ is drawn from the same distribution as wages of the social contacts, so that the expectation is unbiased:

$$E[\bar{w}^{LL}_j] = s\,\mu_L + (1-s)\,\mu_L = \mu_L,$$

where $\mu_L$ denotes the mean low skill (log) wage. However, we show later that the efficiency (variance) of $\bar{w}^{LL}_j$ is influenced by $s$.

⁵ These numbers are averages; however, we take them as an approximation for differences in the network composition across families and ignore the stochasticity of the number of social contacts in the following analysis.

Considering the population of all families with low skill parents and assuming that wage draws are independent implies that the average high skill wage $\bar{w}^{LH}_j$ is normally distributed with mean and variance:

$$E[\bar{w}^{LH}_j] = \mu_H, \qquad \operatorname{Var}(\bar{w}^{LH}_j) = \frac{\sigma_H^2}{n_{LH}},$$

where $\mu_H$ is the mean and $\sigma_H^2$ the variance of high skill (log) wages. In a similar way, the average low skill wage $\bar{w}^{LL}_j$ is normally distributed with the following mean and variance:

$$E[\bar{w}^{LL}_j] = \mu_L, \qquad \operatorname{Var}(\bar{w}^{LL}_j) = \sigma_L^2\left[s^2 + \frac{(1-s)^2}{n_{LL}}\right],$$

where $\sigma_L^2$ is the variance of low skill (log) wages. Note that the underlying normality assumption of (log) wages is necessary for $s \neq 0$ and $s \neq 1/(1+n_{LL})$; however, it becomes redundant for $s = 0$ or $s = 1/(1+n_{LL})$, when the Central Limit Theorem would apply.
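The claim that the weight s leaves the average low skill wage unbiased but changes its variance can be checked by simulation. This is a minimal sketch under stated assumptions: independent normal (log) wage draws, an integer number of low skill contacts, and hypothetical parameter values; the function names are illustrative. The theoretical variance used for comparison is the variance of a weighted sum of independent draws, σ_L² [s² + (1 − s)²/n_LL].

```python
import random
from statistics import fmean, pvariance


def weighted_low_skill_average(parent_wage, contact_wages, s):
    """Weight s on the parent's wage, 1 - s on the mean wage of low skill contacts."""
    return s * parent_wage + (1 - s) * fmean(contact_wages)


def simulate(mu_L, sigma_L, n_LL, s, families=20000, seed=0):
    """Mean and variance of the weighted average across simulated low skill families."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(families):
        parent = rng.gauss(mu_L, sigma_L)
        contacts = [rng.gauss(mu_L, sigma_L) for _ in range(n_LL)]
        estimates.append(weighted_low_skill_average(parent, contacts, s))
    return fmean(estimates), pvariance(estimates)


# Hypothetical values: mean log wage 2.5, std 0.3, four contacts, weight 0.5.
mu_L, sigma_L, n_LL, s = 2.5, 0.3, 4, 0.5
sim_mean, sim_var = simulate(mu_L, sigma_L, n_LL, s)
theory_var = sigma_L**2 * (s**2 + (1 - s)**2 / n_LL)

print("simulated mean:    ", round(sim_mean, 4), "(theory:", mu_L, ")")
print("simulated variance:", round(sim_var, 5), "(theory:", round(theory_var, 5), ")")
```

Under equal weighting, s = 1/(1 + n_LL), the same formula gives σ_L²/(1 + n_LL), the variance of a simple average over the parent and all contacts.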
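Let $\Delta^L_j = \bar{w}^{LH}_j - \bar{w}^{LL}_j$ be the skill premium estimated by family $j$ (with a low skill parent). Its distribution in the population of low skill families can be written as:

$$\Delta^L_j \sim N\!\left(\mu_H - \mu_L,\; \frac{\sigma_H^2}{n_{LH}} + \sigma_L^2\left[s^2 + \frac{(1-s)^2}{n_{LL}}\right]\right),$$

where $\mu_L$, $\mu_H$ are the means and $\sigma_L^2$, $\sigma_H^2$ the variances of low and high skill (log) wages. Let the corresponding cumulative distribution function be denoted by $\Phi_L$. Note the following:

$$\operatorname{Var}(\Delta^L_j)\Big|_{s=1} = \frac{\sigma_H^2}{n_{LH}} + \sigma_L^2.$$

This shows that in the extreme case when $s = 1$ second generation individuals take their parent's wage as the only observation and ignore wages of the low skill contacts of their family. In this case the estimates of the skill premium $\Delta^L_j$ are likely to be very dispersed. Next, consider high-skill families. The distribution of the skill premium $\Delta^H_j$ estimated by these families is given by:

$$\Delta^H_j \sim N\!\left(\mu_H - \mu_L,\; \sigma_H^2\left[s^2 + \frac{(1-s)^2}{n_{HH}}\right] + \frac{\sigma_L^2}{n_{HL}}\right),$$

where $n_{HH}$ is the number of high skill contacts and $n_{HL}$ is the number of low skill contacts of the family/parent. Let the corresponding cumulative distribution function be denoted by $\Phi_H$.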
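To see how the network composition feeds into the dispersion of the estimated skill premium, the two variances can be evaluated numerically. The sketch below assumes independent wage draws and a weight s on the parent's wage within the parent's own skill group. It further assumes, as an illustration only, that a high ability child studies when the estimated premium exceeds the cost c. All parameter values and function names are hypothetical.

```python
from statistics import NormalDist


def premium_variance(n_own, n_other, var_own, var_other, s):
    """Variance of the estimated premium in a family.

    n_own / var_own:     contacts and wage variance in the parent's own skill group
    n_other / var_other: contacts and wage variance in the other skill group
    s:                   weight on the parent's wage in the own-group average
    """
    return var_own * (s**2 + (1 - s)**2 / n_own) + var_other / n_other


def study_probability(mean_premium, variance, cost):
    """P(estimated premium > cost) under normality (assumed decision rule)."""
    return 1.0 - NormalDist(mean_premium, variance**0.5).cdf(cost)


# Hypothetical parameter values, not the calibrated ones.
phi, h, alpha, s = 5.0, 0.25, 0.4, 0.2
var_L, var_H = 0.10, 0.20          # wage variances, high skill wages more dispersed
mean_premium, cost = 0.40, 0.25    # cost below the true premium

n_LL, n_LH = phi * (1 - h), phi * h * (1 - alpha)
n_HH, n_HL = phi * h, phi * (1 - h) * (1 - alpha)

var_delta_L = premium_variance(n_LL, n_LH, var_L, var_H, s)   # low skill families
var_delta_H = premium_variance(n_HH, n_HL, var_H, var_L, s)   # high skill families

prob_L = study_probability(mean_premium, var_delta_L, cost)
prob_H = study_probability(mean_premium, var_delta_H, cost)

print("Var(premium), low skill families: ", round(var_delta_L, 4))
print("Var(premium), high skill families:", round(var_delta_H, 4))
print("P(study), low skill families: ", round(prob_L, 3))
print("P(study), high skill families:", round(prob_H, 3))
```

Because the cost lies below the true premium, the family type with the less dispersed estimate has the higher probability of studying, which is the mechanism behind the positive schooling correlation described in the text.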
Comparing the distributions of $\Delta^L_j$ and $\Delta^H_j$, we can see that the estimates of the skill premium (the return to schooling) are unbiased in both types of families; however, the two variances can differ. The reason is that the precision of the information available to children in the two types of families depends on the structure of social networks, that is, on the network size parameters $n_{HH} = \phi h$, $n_{HL} = \phi(1-h)(1-\alpha)$, $n_{LL} = \phi(1-h)$ and $n_{LH} = \phi h(1-\alpha)$. Proposition 1 compares the two variances and states the conditions under which the variance of the skill premium $\Delta^L_j$ is greater than that of $\Delta^H_j$. It shows that the estimated return to schooling is more dispersed in low-skill families, implying more uncertainty, if the weighted variance ratio is above 1 but below an endogenous upper bound. To understand this condition, consider the simplified case $s = 0$, in which the wage of the parent is not included in the estimation.
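The two variances then become:

$$V[\Delta^L_j] = \frac{V[w^H_i]}{n_{LH}} + \frac{V[w^L_i]}{n_{LL}}, \qquad V[\Delta^H_j] = \frac{V[w^H_i]}{n_{HH}} + \frac{V[w^L_i]}{n_{HL}},$$

and their difference can be written as:

$$V[\Delta^L_j] - V[\Delta^H_j] = \frac{\alpha}{(1-\alpha)\phi}\left(\frac{V[w^H_i]}{h} - \frac{V[w^L_i]}{1-h}\right) > 0 \iff \frac{V[w^H_i](1-h)}{V[w^L_i]\,h} > 1.$$

In the following we assume that this condition holds until the end of the paper. It becomes more likely when wages of high (low) skill workers are more (less) dispersed. Moreover, the difference is larger when network homophily is stronger (higher $\alpha$) and when social networks are smaller (smaller $\phi$). In the appendix we show that this condition is sufficient but not always necessary for small values of $s$ above zero but below a given threshold.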
However, when $s$ is large, meaning that children overweight the parental wage, the situation is different. To understand this case, consider the extreme point $s = 1$. The two variances are then given by:

$$V[\Delta^L_j] = \frac{V[w^H_i]}{n_{LH}} + V[w^L_i], \qquad V[\Delta^H_j] = V[w^H_i] + \frac{V[w^L_i]}{n_{HL}},$$

and their difference can be written as:

$$V[\Delta^L_j] - V[\Delta^H_j] = V[w^L_i]\left(1 - \frac{1}{n_{HL}}\right) - V[w^H_i]\left(1 - \frac{1}{n_{LH}}\right) > 0 \iff \frac{V[w^H_i]}{V[w^L_i]} < \frac{1 - 1/n_{HL}}{1 - 1/n_{LH}}.$$

This condition gives rise to the upper bound of the weighted variance ratio described in proposition 1. Intuitively, if $s = 1$ then children rely exclusively on the wage of their parent when forming expectations about wages in the respective skill group. If high-skill wages are much more dispersed than low-skill wages, violating the upper bound condition, then children in high-skill families possess less precise information about the skill premium than children in low-skill families. Therefore, the upper bound condition is necessary at $s = 1$. In the appendix we show that this condition is sufficient (but not always necessary) for large values of $s$ above a given threshold and below 1.
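The two boundary cases $s = 0$ and $s = 1$ can be checked numerically. The sketch below is a minimal illustration, assuming independent wage draws and the network sizes defined above; the function names and all parameter values are hypothetical and are not taken from the calibration.

```python
def premium_variances(s, phi, h, alpha, var_h, var_l):
    """Return (V[Delta_L], V[Delta_H]) for a weight s on the parental wage.

    var_h and var_l are the variances of high- and low-skill wages.
    Assumes independent wage draws (an assumption of this sketch).
    """
    n_hh = phi * h                        # high-skill contacts, high-skill family
    n_hl = phi * (1 - h) * (1 - alpha)    # low-skill contacts, high-skill family
    n_ll = phi * (1 - h)                  # low-skill contacts, low-skill family
    n_lh = phi * h * (1 - alpha)          # high-skill contacts, low-skill family
    v_low = var_h / n_lh + var_l * (s**2 + (1 - s)**2 / n_ll)
    v_high = var_h * (s**2 + (1 - s)**2 / n_hh) + var_l / n_hl
    return v_low, v_high


def gap(s, **params):
    """V[Delta_L] - V[Delta_H]; positive means low-skill families are less precise."""
    v_low, v_high = premium_variances(s, **params)
    return v_low - v_high


# Hypothetical parameter values, chosen only for illustration.
PARAMS = dict(phi=40, h=0.38, alpha=0.58, var_h=0.30, var_l=0.20)

for s in (0.0, 0.05, 0.5, 1.0):
    print(f"s={s:.2f}  V_L - V_H = {gap(s, **PARAMS):+.4f}")
```

With these illustrative numbers the gap is positive at $s = 0$, because the weighted variance ratio exceeds 1, and negative at $s = 1$, because high-skill wages are dispersed enough to violate the upper bound.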
Thus proposition 1 describes a set of sufficient conditions under which the skill premium estimated by low-skill families, $\Delta^L_j$, is more dispersed and uncertain than the skill premium estimated by high-skill families, $\Delta^H_j$. Next we investigate the decision to acquire education. Children in low-skill families decide to obtain higher education if the cost $c$ is low, $c < \Delta^L_j$, so the number of children obtaining education in low-skill families is $p(1 - \Phi_L(c))$. The corresponding number in high-skill families is $p(1 - \Phi_H(c))$. Under the conditions described in proposition 1, the distribution $\Phi_L(\cdot)$ is more dispersed than $\Phi_H(\cdot)$, so $\Phi_L(\cdot)$ is obtained from $\Phi_H(\cdot)$ by a mean-preserving increase in spread. Moreover, the two cumulative distribution functions satisfy a single-crossing property and intersect only once, at the mean. Following Diamond and Stiglitz (1973), this property implies that $\Phi_L(c) > \Phi_H(c)$ for any cost $c$ below the common mean. This shows the main result of our study: (high ability) children in low-skill families face stronger uncertainty when forming expectations about the skill premium, so they are less likely to obtain higher education than (high ability) children in high-skill families, who possess more precise information. The intuition for this result is the following: low-skill workers are the majority in the market, so all families have many contacts of this type and the estimate of the average low-skill wage is relatively precise. The major difference comes from the estimate of the average high-skill wage. High-skill workers are the minority and, because of skill homophily, there are relatively few of them in the networks of low-skill families. Thus the average high-skill wage is estimated with low precision and a high degree of uncertainty in low-skill families.
The high degree of uncertainty in turn implies that high ability children in low skill families are more likely to make a mistake by not obtaining the education compared to children in high skill families supplied with more accurate information from the high skill contacts of their parents.
This mechanism can be substantially reinforced (mitigated) if high skill wages are more (less) dispersed compared to low skill wages.
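The effect of a mean-preserving spread on the schooling decision can be illustrated with a short computation. The sketch below assumes normally distributed premium estimates with a common mean; the values of $p$ and $c$ are the benchmark values reported in the calibration, while the mean premium and the two variances are hypothetical numbers chosen for illustration.

```python
import math


def normal_cdf(x, mean, var):
    """CDF of a normal distribution with the given mean and variance."""
    return 0.5 * (1.0 + math.erf((x - mean) / math.sqrt(2.0 * var)))


def share_educated(p, cost, mean_premium, var_premium):
    """p * (1 - Phi(c)): high-ability children whose estimated premium exceeds c."""
    return p * (1.0 - normal_cdf(cost, mean_premium, var_premium))


p, cost = 0.474, 0.278            # benchmark values from the calibration
mean_premium = 0.40               # hypothetical true skill premium, above the cost
var_low_family = 0.06             # hypothetical V[Delta_L], more dispersed
var_high_family = 0.03            # hypothetical V[Delta_H], less dispersed

share_low = share_educated(p, cost, mean_premium, var_low_family)
share_high = share_educated(p, cost, mean_premium, var_high_family)
print(f"low-skill families:  {share_low:.3f}")
print(f"high-skill families: {share_high:.3f}")
```

Because the cost lies below the common mean, the more dispersed distribution puts more mass below $c$, so fewer children in low-skill families study even though both estimates are unbiased.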
Finally, we investigate how the two network characteristics $\alpha$ and $\phi$ influence the decision to acquire education. Our results are summarized in lemma 2.

Lemma 2: In both types of families more children obtain education if the network size $\phi$ is larger and if network homophily $\alpha$ is weaker.

Recall that parameter $\phi$ is the maximum network size in the absence of skill homophily. Lemma 2 shows that having larger networks improves the quality of information, so the estimates of the return to schooling $\Delta^k_j$, $k = L, H$, become more accurate and less dispersed. Hence, fewer children in both types of families make mistakes. The opposite is true when the homophily parameter $\alpha$ is increasing. Stronger network homophily implies fewer social links between the two groups of families and reduces the precision of the skill premium estimates. In absolute terms both effects are stronger for low-skill families: their networks are very asymmetric, with only a few social links to high-skill families, which makes their decisions more sensitive to the network parameters (see lemma 1).

Measures of intergenerational skill dependence
In this section we derive two indicators measuring the intergenerational skill dependence, the correlation coefficient $\rho$ and the mobility ratio $m$, and investigate how these measures depend on the underlying network and labour market parameters. Let $\xi$ be a binary indicator of the education of parents, taking value 1 for high-skill parents and 0 otherwise, so that $E[\xi] = h$. In a similar way, let $\eta$ be a binary indicator of the education of children. Based on our results from above, $P(\eta = 1 \mid \xi = 1) = p(1 - \Phi_H(c))$ and $P(\eta = 1 \mid \xi = 0) = p(1 - \Phi_L(c))$, so that $E[\eta] = \bar h = hp(1 - \Phi_H(c)) + (1-h)p(1 - \Phi_L(c))$ is the average education level in the second generation of workers. The intergenerational correlation in educational choices $\rho$ is then given by:

$$\rho = \sqrt{\frac{h(1-h)}{\bar h(1-\bar h)}}\; p\,\big(\Phi_L(c) - \Phi_H(c)\big). \qquad (1)$$

From equation (1) we can see that the intergenerational schooling correlation is positive if social networks exhibit skill homophily ($0 < \alpha < 1$). In the special case when the economy is in a macroeconomic steady state, such that the fraction of high-skill workers is constant over time, $h = \bar h$, we find that $\rho = p\Delta\Phi(c)$ with $\Delta\Phi(c) = \Phi_L(c) - \Phi_H(c)$, so it is largely driven by the different decisions of children in high- and low-skill families. Further, the theoretical transition matrix allows us to analyze the effect of social networks on intergenerational mobility in education. Following the definition by Bauer and Riphahn (2007), we use the mobility ratio $m = (1 - \Phi_H(c))/(1 - \Phi_L(c))$, the probability that a child born into a high-skill family becomes high-skilled relative to a child born into a low-skill family.

To see how $\rho$ depends on $p$, let $x = h(1 - \Phi_H(c)) + (1-h)(1 - \Phi_L(c))$, so that $\bar h = px$. The correlation coefficient $\rho$ can then be rewritten as:

$$\rho = \frac{\sqrt{h(1-h)}\,\big(\Phi_L(c) - \Phi_H(c)\big)}{\sqrt{x(1/p - x)}}.$$

The fraction $1/p$ is the only place where $p$ enters the last expression, which shows that $\partial\rho/\partial p > 0$.
The intuition behind lemma 3 is the following. The difference $p(\Phi_L(c) - \Phi_H(c))$ reflects the gap in the share of high-ability children who make a mistake by not studying because of the poor quality of information. If the share of high-ability children increases (a higher $p$), a larger number of children consider acquiring education and can potentially make a mistake, so the gap $p(\Phi_L(c) - \Phi_H(c))$ increases, which contributes to a stronger correlation coefficient $\rho$. There is also an opposing effect of $p$ on the variance term $\bar h(1 - \bar h)$, but we show in the proof of lemma 3 that this effect is always dominated.
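Both measures follow directly from the two conditional probabilities of obtaining education. The sketch below is a minimal illustration for binary parent and child indicators; the function name is hypothetical, and the mobility ratio is taken to be the ratio of the two conditional probabilities, as described verbally in the calibration section. The example inputs are the shares 0.449 and 0.443 and the value $h = 0.378$ reported later in the text.

```python
import math


def mobility_measures(h, q_high, q_low):
    """Return (h_bar, rho, m) for binary parent/child schooling indicators.

    q_high = P(child educated | high-skill parent), e.g. p * (1 - Phi_H(c))
    q_low  = P(child educated | low-skill parent),  e.g. p * (1 - Phi_L(c))
    """
    h_bar = h * q_high + (1 - h) * q_low        # share of educated children
    cov = h * (1 - h) * (q_high - q_low)        # Cov(xi, eta) for two binary variables
    rho = cov / math.sqrt(h * (1 - h) * h_bar * (1 - h_bar))
    m = q_high / q_low                          # mobility ratio
    return h_bar, rho, m


h_bar, rho, m = mobility_measures(h=0.378, q_high=0.449, q_low=0.443)
print(f"h_bar={h_bar:.3f}  rho={rho:.4f}  m={m:.4f}")
```

In a steady state with $\bar h = h$ the square-root terms cancel and the correlation reduces to the difference between the two conditional probabilities.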

Observation error
In the previous section we assumed that the wages of social contacts are perfectly observable, which is a relatively strong assumption. In the present section we relax this assumption and consider a situation in which the wages of friends are observed with an observation error $\varepsilon$. Let $\varepsilon^L_i$ be the observation error for low-skill wages and $\varepsilon^H_i$ the error for high-skill wages. We assume that the two error terms are normally distributed with zero mean and variances $\sigma^2_L$ and $\sigma^2_H$, respectively. Thus children in family $j$ observe wage signals $w^L_i + \varepsilon^L_i$ for low-skill contacts of their parents and $w^H_i + \varepsilon^H_i$ for high-skill contacts of their parents. There is no error in observing the wage of the parent. The average wage of low-skill friends in low-skill families, $w^{LL}_j$, and the average wage of high-skill friends in low-skill families, $w^{LH}_j$, remain normally distributed, and so do the skill premia $\Delta^L_j$ and $\Delta^H_j$ estimated in the two groups of families. Larger variances of the error terms, i.e. larger $\sigma^2_L$ and $\sigma^2_H$, increase uncertainty about friends' wages and raise the variances of the estimated skill premia in both types of families.
Proposition 3 shows the exact weights $s_L$ and $s_H$ that the generation of children should assign to parental wages in order to deal with the observation error. In the absence of noise these weights are equal to $1/(1+n_{LL})$ for low-skill families and $1/(1+n_{HH})$ for high-skill families. In this special case the variance of the return to schooling $V[\Delta^k_j]$ is already minimized if a simple arithmetic average of all available wage observations is used, so the estimated returns to schooling $\Delta^k_j$, $k = L, H$, coincide with the classical OLS estimator and are BLUE (Best Linear Unbiased Estimators). The situation is different in the presence of noise: the precision of the available wage observations differs (heteroscedasticity), so a weighted least squares estimation should be applied, giving rise to the updated weights $s_L$ and $s_H$. This weighting scheme addresses the heteroscedasticity and guarantees efficiency of the estimated return to schooling $\Delta^k_j$, $k = L, H$. In order to compare the weights for high- and low-skill families, and the weights with and without noise, we introduce an auxiliary factor $z > 0$ such that $\sigma^2_k = zV[w^k_i]$, $k = L, H$. Intuitively, this means that the variance of the error term is proportional to the variance of the actual wage. Thus low (high) skill wages are observed with larger (smaller) precision if their variance is smaller (larger). For $z = 0$ we get the benchmark case from the previous section, in which friends' wages are observed without error. This allows us to formulate the following corollary.

Corollary 1: Let $z$ be the factor of proportionality, such that $\sigma^2_k = zV[w^k_i]$, $k = L, H$. Children in high-skill families should optimally assign a higher weight to the wage of their parent than children in low-skill families, since $n_{HH} = \phi h < \phi(1-h) = n_{LL}$ if $h < 0.5$, $\forall z \geq 0$. If $z$ is increasing, then children in all families should assign a higher weight to the parental wage, that is, $\partial s_k/\partial z > 0$.

This corollary shows that children in high-skill families should optimally put a higher weight on the wage of their parents, since the number of high-skill contacts in their families is smaller on average than the number of low-skill contacts in low-skill families. Inserting the optimal weights $s_k$, $k = L, H$, into the equations for the variances $V[\Delta^k_j]$ yields the minimized variances.

Consider now an arbitrary common weight $s$. If $s$ is large, such that $s > s_H$, and increasing, then the fraction of educated children $1 - \Phi_k(c)$, $k = L, H$, is decreasing in $s$ in both types of families, so the impact on the mobility indicators $m$ and $\rho$ is ambiguous. However, when $s$ is in the middle range, $s \in [s_L, s_H]$, and increasing, the mobility ratio $m$ rises. The reason is that putting more weight on the parental wage in this range improves the precision of the return to schooling for children in high-skill families and reduces the probability of making a mistake by not obtaining education (higher $1 - \Phi_H(c)$). In contrast, putting more weight on the parental wage reduces the precision of the estimates of children in low-skill families and makes it more likely that they make mistakes by not studying.
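The logic of the optimal weights can be illustrated with a simple inverse-variance calculation. The sketch below is not the formula stated in proposition 3, which is not reproduced here; it derives the variance-minimising weight under the assumptions that the error variance is $z$ times the wage variance, that errors are independent of wages, and that the parental wage is observed exactly. The function names and network sizes are illustrative.

```python
def own_group_variance(s, n, var_wage, z):
    """Variance of s * parental wage + (1 - s) * mean of n noisy contact wages.

    Each contact wage carries an observation error with variance z * var_wage
    (assumed independent of the wage); the parental wage is observed exactly.
    """
    return var_wage * (s**2 + (1 - s)**2 * (1 + z) / n)


def optimal_weight(n, z):
    """Weight on the parental wage that minimises own_group_variance.

    Follows from the first-order condition s * n = (1 - s) * (1 + z).
    """
    return (1 + z) / (n + 1 + z)


# Illustrative network sizes for phi = 40 and h = 0.38 (hypothetical values).
n_hh, n_ll = 40 * 0.38, 40 * (1 - 0.38)
for z in (0.0, 0.5, 1.0):
    print(f"z={z:.1f}  s_H={optimal_weight(n_hh, z):.4f}  s_L={optimal_weight(n_ll, z):.4f}")
```

Under these assumptions the weight equals $1/(1+n)$ when $z = 0$, rises with $z$, and is larger for the group with fewer same-skill contacts, in line with the properties stated in corollary 1.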
In the next step we investigate the impact of h, which is the share of high skill families, on intergenerational mobility. This gives rise to the following proposition: Proposition 4: Consider the efficient estimator of the return to schooling Δ k j , k = L, H evaluated at the optimal weight s k , k = L, H. Then the share of children obtaining education in

Proof: Appendix
When $h$ is increasing there are two effects on the network composition of low-skill families. On the one hand, the number of low-skill contacts in their networks, $n_{LL} = \phi(1-h)$, is decreasing. On the other hand, the number of high-skill contacts, $n_{LH} = \phi h(1-\alpha)$, is increasing. Proposition 4 shows that this second effect always dominates for low-skill families under the aforementioned conditions. The reason is that the social networks of low-skill families are very asymmetric (lemma 1), so gaining more high-skill contacts has a stronger impact on the precision of their estimates than losing some low-skill contacts. Thus a larger fraction of high-skill families in the market improves the quality of information available to children in low-skill families, and the share of children obtaining education in low-skill families, $p(1 - \Phi_L(c))$, is increasing in $h$.
The networks of high-skill families are more balanced. The positive effect of having more high-skill contacts, $n_{HH} = \phi h$, which raises the precision of the estimated return to schooling (a lower $V[\Delta^H_j]$), dominates for low values of $h$. However, when $h$ is large (above $h^*$), losing low-skill contacts from the network leads to a substantial loss in precision and raises the variance $V[\Delta^H_j]$. Thus, fewer children in high-skill families obtain education. Combining these findings, proposition 4 shows that the mobility ratio should eventually be decreasing in $h$ when $h$ is sufficiently large.
More generally, our findings in proposition 4 suggest that the relationship between the share of high-skill families $h$ and the measures of intergenerational skill dependence should be hump-shaped. Even though the effect on the correlation coefficient is ambiguous, it is even more likely to be hump-shaped than the mobility ratio, since it contains the variance term $h(1-h)$ in addition to the network effects described in proposition 4. We investigate the impact of $h$ on the correlation coefficient in more detail in the following numerical analysis.
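The pivotal share $h^*$ for high-skill families can be located numerically. The sketch below is an illustration only: it assumes that children use the variance-minimising weight on the parental wage, that the error variance is $z$ times the wage variance, and that wage draws are independent. The wage variances and the network size are hypothetical, so the resulting value is not the calibrated one.

```python
def var_premium_high(h, phi, alpha, var_h, var_l, z=0.0):
    """V[Delta_H] at the variance-minimising weight on the parental wage."""
    n_hh = phi * h                          # high-skill contacts
    n_hl = phi * (1 - h) * (1 - alpha)      # low-skill contacts
    own = var_h * (1 + z) / (n_hh + 1 + z)  # high-skill average, parent included
    other = var_l * (1 + z) / n_hl          # low-skill average, contacts only
    return own + other


# Hypothetical parameter values, chosen only for illustration.
PARAMS = dict(phi=40, alpha=0.584, var_h=0.30, var_l=0.20)

grid = [i / 100 for i in range(1, 100)]
h_star = min(grid, key=lambda h: var_premium_high(h, **PARAMS))
print(f"variance-minimising share of high-skill families: h* = {h_star:.2f}")
```

Below $h^*$ additional high-skill contacts lower the variance; above it the loss of low-skill contacts dominates and the variance rises again, which lowers the share of children in high-skill families who study whenever the cost lies below the mean premium.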

Calibration
We calibrate the model using data from the German Socio-Economic Panel (SOEP). The purpose of calibrating the model is twofold. The first goal is to provide a range of quantitative estimates of the correlation coefficient that can potentially be attributed to the mechanism described in our model, that is the impact of parental networks on educational decisions of the young generation via the expectation channel. The second goal is to perform comparative statics with respect to parameters, whose theoretical effect on the correlation coefficient and the mobility ratio is ambiguous.
Our empirical data (SOEP) is a representative longitudinal household panel and approximately 15000 households and 30000 individuals participate in the survey (details can be found in Goebel et al. (2019)). For calibrating the social networks we use the SOEP survey from 2016 which includes a "Family and Social Networks" module. More specifically, respondents were asked to provide some information about three friends or acquaintances. The inclusion of this module makes the 2016 SOEP cross section suitable for this part of our calibration. Additionally, the SOEP provides biography information on the surveyed individuals including own and parental education.
We assume the second generation to be the survey respondents in the age range 21-65. Not all students who attain Abitur proceed to enrol in university, and students with an intermediate secondary school degree have options to build upon it and qualify for enrolment in a tertiary degree (Witte and Kalleberg, 1995). We define as high-skilled an individual who has obtained an upper secondary school degree certificate (Abitur). Further, our cross tabulations show that some respondents obtained a tertiary degree or were enrolled at a university or university of applied science at the time of the survey but did not have Abitur. This group is also defined as high-skilled.
The "Family and Social Networks" module in the 2016 SOEP survey allows us to construct a social network of the respondents and estimate the relative network homophily $\alpha$. Specifically, the survey participants were asked to consider three friends or acquaintances. The exact formulation of the question is: "Please think of three people outside of your household who are important for you, personally. They can be relatives or non-relatives". The respondents are then asked to provide information about these contacts, including the highest school degree attained. We assume that a high-skilled contact is a person who has "Abitur / Hochschulreife / Fachhochschulreife", which corresponds to an upper secondary school degree and eligibility to enrol in a tertiary degree. We do not have information on the tertiary education of the friends and acquaintances. The share of same-type contacts among high-skilled respondents is $\gamma_H = \ldots/13260 \approx 0.647$. If the formation of social links did not exhibit skill homophily ($\alpha = 0$), we would expect the fractions of same-type contacts, $\gamma_L$ and $\gamma_H$, to equal the fractions of low- and high-skilled individuals in the network, respectively. We see that this is not the case. While 42.9% of the individuals in the sample are high-skilled, 64.7% of their contacts are also high-skilled. Similarly, 57.1% of the respondents are low-skilled but 76.5% of their contacts are also low-skilled.
In order to calculate the relative network homophily $\alpha$, we use the fact that the number of same-skill contacts is $\phi((1-h)^2 + h^2)$ while the total number of contacts is $\phi(1 - 2\alpha h(1-h))$. Based on table 2 we have $h_{2016} = 0.429$. Plugging this value in, we find the relative network homophily $\alpha = 0.584$.
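The calibration step can be retraced with a few lines of code. The sketch below equates the model-implied share of same-skill contacts with its empirical counterpart and solves for the homophily parameter. It assumes that every respondent reports the same number of contacts, so that the aggregate same-type share is the respondent-weighted mean of the two reported shares (64.7% and 76.5%); this aggregation and the function name are assumptions of the sketch.

```python
def relative_homophily(h, same_type_share):
    """Solve ((1-h)^2 + h^2) / (1 - 2*alpha*h*(1-h)) = same_type_share for alpha."""
    same = (1 - h) ** 2 + h ** 2
    return (1 - same / same_type_share) / (2 * h * (1 - h))


h_2016 = 0.429                    # share of high-skilled respondents, 2016 wave
gamma_h, gamma_l = 0.647, 0.765   # same-type contact shares reported in the text

# Assumption: equal number of contacts per respondent, so the aggregate share
# is the respondent-weighted mean of the two group-specific shares.
same_type_share = h_2016 * gamma_h + (1 - h_2016) * gamma_l
alpha = relative_homophily(h_2016, same_type_share)
print(f"alpha = {alpha:.3f}")
```

Under this aggregation the result comes out close to the value of 0.584 reported above; without homophily the same-type share would equal $(1-h)^2 + h^2$ and the function would return zero.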
One caveat of our data is that we do not observe the social contacts of the parents. Therefore, we calibrate the relative homophily parameter $\alpha$ on the network of the children. To check whether this approach is reasonable, we calibrate $\alpha$ using earlier waves of the SOEP survey and examine whether the relative network homophily changes substantially over time. A stable value of $\alpha$ would indicate that the homophily of the parents' network can be approximated from the children's network. We choose previous waves of the SOEP survey that also include the "Family and Social Networks" module. Specifically, in 2011 and 2006 the question about the highest school leaving certificates of three friends was asked. Using the approach discussed above, we estimate $\alpha = 0.598$ from the 2011 SOEP survey and $\alpha = 0.595$ from the 2006 survey. Table 3 compares the networks based on the three waves of the SOEP survey. The estimate of the relative network homophily changes little across survey waves, which indicates that the calibration is appropriate. In our numerical analysis we further explore the effect of varying $\alpha$ on the key model variables.
To calibrate the relevant labour market parameters, we use the more recent 2018 wave of the SOEP survey which provides us with a larger sample but does not include information about social connections. Again, we consider survey respondents in the age range 21-65 and exclude those for whom we do not have information about their secondary education. This leads to a sample size of 15932 individuals, 37.8% of whom are high-skilled, so we know that h = 0.378. SOEP 2018 has detailed information about parental education. The definition of high-skilled parent is consistent with that of a high-skilled child. Specifically, high-skilled parents are those who obtained a secondary school degree certificate (Abitur) and/or a tertiary degree.
We drop the cases for which there is no educational information for either one of the parents (1238 observations). Also, we drop the cases in which both the mother and the father have [...]

Next, we estimate the first and second moments of the wage distributions of low- and high-skilled workers. The survey respondents are asked how much they earned from work last month. This includes both wages earned as an employee and income from self-employment. We consider only those who declare that they are full- or part-time employed. Figure 2 shows the kernel density estimates of the wage distributions conditional on skill level. We drop wages above the 99.9th percentile of the overall income distribution in order to exclude outliers from the calibration. [...] low skill families, in line with proposition 2.
The last parameter $p$ is the probability that the child has high abilities. We calibrate this parameter by targeting the fraction of high-skill individuals in the 2018 SOEP sample, that is $h = 0.378$, producing $p = 0.474$. The empirical mobility ratio shows the probability that a child born into a high-skilled family becomes high-skilled relative to a child born into a low-skilled family. The two probabilities differ by a factor of two and a half.

In the model, the correlation coefficient $\rho$ reaches its maximum of 0.055 at the education cost $c = 0.278$ when $p = 0.474$, which is the benchmark case. Our data does not include information about the cost $c$; moreover, this cost is unobservable since it includes monetary and non-monetary components (e.g. effort). We therefore interpret our result as an upper bound of the correlation coefficient. We believe that this approach is more informative than an alternative point estimate based on ad-hoc assumptions about the cost. Given that the empirically observed correlation coefficient is equal to 0.375, we conclude that parental networks and their impact on expectations can explain up to 14.7% of the intergenerational correlation coefficient (that is, 0.055/0.375). The upper bound of the mobility ratio is 1.177, which means that the probability of acquiring higher education is almost 1.2 times higher for children in high-skill families than for children in low-skill families. Note that these estimates refer only to the impact of parental networks on expectations and do not include other factors. The right panel of figure 4 shows that there is a non-monotone relationship between the intergenerational schooling correlation and the share of high-skill families $h$. The same is true for the mobility ratio in figure 5. The reason is that a higher fraction of high-skill parents reduces the informational disadvantage faced by children in low-skill families, so that more children in these families acquire education, in line with proposition 4.
But the effect on children in high-skill families is ambiguous: on the one hand, there is lower uncertainty about high-skill wages, but on the other hand, there is higher uncertainty about low-skill wages. The first effect dominates when $h$ is small, implying a steep increase in $1 - \Phi_H(c)$, whereas the second effect dominates when $h$ is sufficiently large, implying a moderate reduction in $1 - \Phi_H(c)$. This leads to a hump-shaped relationship between the share of high-skill parents and the two mobility measures $\rho$ and $m$, whereby for $\rho$ the effect is amplified by the additional term $h(1-h)$. We find that the pivotal point $h^*$ described in proposition 4 is equal to 0.35; however, both mobility measures achieve a maximum for lower values of $h$. In section 5 we extend the model to account for the inheritance of abilities and compare the impact of $h$ on $\rho$ in the model and in the data.

Further, we analyze the effects of the network parameters $\phi$ and $\alpha$ (figures 6 and 7). In particular, we vary the size parameter $\phi$ between 20 and 60, so the average network size $n$ changes between 15 and 50. We can see that both variables have an inverse U-shape influence on $\rho$ and $m$. As demonstrated in lemma 2, a higher size $\phi$ reduces uncertainty about the market skill premium in both types of families. For high-skill families this effect is stronger at low values of $\phi$, because their social networks are smaller on average (see lemma 1) than those of low-skill families. Thus the marginal gain from larger social networks is stronger for high-skill families, leading to an increasing correlation coefficient $\rho$ at low values of $\phi$. However, as networks get large the marginal gain of high-skill families diminishes and falls below that of low-skill families, which reverses the relationship between $\phi$ and $\rho$.

The right panels of figures 6 and 7 illustrate the impact of the skill homophily parameter $\alpha$. Even though the quality of information worsens with $\alpha$ in both types of families and fewer children obtain education (lemma 2), the negative consequences are stronger for low-skill families when $\alpha$ is small. The reason is that their networks are less balanced (lemma 1), so a stronger group segregation (i.e. higher $\alpha$) rapidly reduces the small number of valuable high-skill contacts in low-skill families. However, as $\alpha \to 1$ both types of families become "infinitely uncertain" about the market skill premium, which reverses the effect of $\alpha$. In the limiting case $\alpha \to 1$ the family background becomes irrelevant, since the shares of educated children in the two types of families converge.

One interesting observation from the right panels of both figures is that there exists a lower bound of the correlation coefficient $\rho$ and the mobility ratio $m$ when there is no network homophily, $\alpha = 0$. Comparing the variances of the two skill premia for $\alpha = 0$, we find that $V[\Delta^L_j]$ exceeds $V[\Delta^H_j]$ when the corresponding weighted variance ratio is above 1, which holds in our data. This leads to a situation in which the number of children acquiring education in high-skill families, $p(1 - \Phi_H(c)) = 0.449$, is larger than the number of children acquiring education in low-skill families, $p(1 - \Phi_L(c)) = 0.443$, so that $\rho > 0$ and $m > 1$. Intuitively, this situation arises because children with high-skill parents have on average one more high-skill contact (and one fewer low-skill contact) than children in low-skill families. Given that information about high-skill wages is more valuable, since high-skill wages are more dispersed and the number of high-skill people in the market is small, this asymmetry benefits children in high-skill families and leads to $\rho > 0$ even if parental networks are identical and there is no skill homophily, that is, $\alpha = 0$.
Note that our numerical results allow us to make a number of quantitative statements concerning the sensitivity/elasticity of the two measures of intergenerational mobility $\rho$ and $m$ to the considered parameters. In particular, the effect of the network size $\phi$ is moderate, whereas both measures $\rho$ and $m$ are very sensitive to the educational cost $c$, the share of high-skill families $h$ and the degree of skill homophily $\alpha$.
Finally, we analyze the impact of the observation error $\varepsilon^k_i$ on both measures of intergenerational mobility, $\rho$ and $m$, by varying parameter $z$ in the range $[0, 1]$. Figure 8 shows that both types of families suffer from higher uncertainty when $z$ is small and increasing, implying that fewer children obtain education. Yet low-skill families are more strongly affected since their social networks are more asymmetric (lemma 1), so both measures $\rho$ and $m$ are increasing initially. However, differences in the available information disappear as $z \to \infty$, meaning that both types of families become "infinitely uncertain" about the skill premium, which reverses the effect of $z$ on $\rho$ and $m$ for large values of $z$. Quantitatively, the effect of $z$ is small to moderate. The reason is that social networks of parents in our benchmark specification are rather large, so the estimates of the skill premium are relatively precise despite the presence of noise.

To summarize, both types of families suffer from higher uncertainty when $\alpha$ and $z$ are small and increasing, but the negative impact is stronger for low-skill families since their networks are more asymmetric. In all considered cases this leads to an inverse U-shape relationship between the respective parameter and the two measures $\rho$ and $m$. Quantitatively, the effect of variables $z$ and $\phi$ is small to moderate, because social networks of both types of families are sufficiently large on average, but the impact of variables $h$ and $\alpha$ is strong.

Ability transmission
In this section we extend the model by incorporating a positive correlation in the abilities of parents and children. The goal of developing this extension is to analyze the interaction between the pure ability driven correlation in schooling and the network driven correlation described above. In addition, in this section we investigate the fit of an overall model with both components to the empirical data.
Let $\zeta$ denote a binary indicator for the ability of the child, such that $\zeta = 1$ for high-ability children and $\zeta = 0$ for low-ability children. Further, let $p_H > p$ denote the probability that a child born in a high-skill family has high ability and $p_L < p$ the probability that a child born in a low-skill family has high ability. The fraction of high-ability children is then $E[\zeta] = hp_H + (1-h)p_L$, and the covariance between the skill level of the parent $\xi$ and the ability of the child $\zeta$ is given by:

$$\mathrm{Cov}(\xi, \zeta) = E[\xi\zeta] - E[\xi]E[\zeta] = hp_H - h\big(hp_H + (1-h)p_L\big) = h(1-h)(p_H - p_L).$$

Thus the covariance (and the correlation coefficient) between $\xi$ and $\zeta$ is strongly driven by the difference $p_H - p_L$. Following the literature, we attribute this correlation to the combined effect of nature and nurture: it can be driven by the genetic transmission of abilities (nature), but can also stem from the more efficient help that high-skill parents give to their children (nurture).
Next consider the intergenerational correlation coefficient in schooling $\rho$ in the extended model with ability and network components. The number of children acquiring education in high-skill families is given by $p_H(1 - \Phi_H(c))$ and the number of children in low-skill families acquiring education is given by $p_L(1 - \Phi_L(c))$; these are the entries of the matrix of transition probabilities. Thus the number of high-skill children in the combined model is given by $\bar h = hp_H(1 - \Phi_H(c)) + (1-h)p_L(1 - \Phi_L(c))$. Given this notation, the intergenerational covariance in schooling in the combined model is:

$$\mathrm{Cov}(\xi, \eta) = h(1-h)\big[p_H(1 - \Phi_H(c)) - p_L(1 - \Phi_L(c))\big].$$

We can see that it is increasing in the number of children acquiring education in high-skill families, $p_H(1 - \Phi_H(c))$, and decreasing in the number of children acquiring education in low-skill families, $p_L(1 - \Phi_L(c))$. The intergenerational schooling covariance can be decomposed into (1) the ability component driven by the difference $p_H - p_L$, (2) the network component driven by the difference $\Phi_L(c) - \Phi_H(c)$ and (3) the interaction term. The interaction term is always negative in the presence of a positive ability correlation because $p_L < p$ and $p_H > p$. This means that a stronger ability correlation crowds out the network effect. The intuitive explanation for this finding is the following. The network effect is largely driven by high-ability children born in low-skill families. These children face strong uncertainty about the market for high-skill workers and are likely to make a mistake by not obtaining education. However, as the ability correlation gets stronger, the number of high-ability children born in low-skill families decreases, so the number of children who can potentially make a mistake by not studying also decreases. This explains the negative sign of the interaction term and the crowding-out effect.
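The combined model can be explored numerically with a small helper. The sketch below computes the schooling covariance and correlation for binary parent and child indicators from the two ability probabilities and the two values of the cumulative distribution functions at the cost $c$. The function name and the example inputs are hypothetical; the sketch does not reproduce the paper's three-term decomposition, only the two limiting cases in which one of the channels is switched off.

```python
import math


def extended_model(h, p_high, p_low, phi_high_c, phi_low_c):
    """Schooling covariance and correlation with ability transmission.

    p_high, p_low: P(high ability | high-skill / low-skill parent)
    phi_high_c, phi_low_c: Phi_H(c) and Phi_L(c), the shares of high-ability
    children whose estimated premium falls below the cost c.
    """
    q_high = p_high * (1 - phi_high_c)      # educated share, high-skill families
    q_low = p_low * (1 - phi_low_c)         # educated share, low-skill families
    h_bar = h * q_high + (1 - h) * q_low    # share of educated children overall
    cov = h * (1 - h) * (q_high - q_low)    # Cov(xi, eta)
    rho = cov / math.sqrt(h * (1 - h) * h_bar * (1 - h_bar))
    return h_bar, cov, rho


# Hypothetical inputs, chosen only for illustration.
h, phi_h_c, phi_l_c = 0.378, 0.05, 0.07
for p_high, p_low in ((0.474, 0.474), (0.60, 0.40)):
    h_bar, cov, rho = extended_model(h, p_high, p_low, phi_h_c, phi_l_c)
    print(f"p_H={p_high:.3f} p_L={p_low:.3f}  h_bar={h_bar:.3f}  rho={rho:.3f}")
```

With equal ability probabilities the covariance reduces to the pure network term $h(1-h)p(\Phi_L(c) - \Phi_H(c))$; with identical information in both types of families it reduces to the pure ability term $h(1-h)(p_H - p_L)(1 - \Phi(c))$.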
We believe that this theoretical finding may have implications for empirical research, since any network/neighborhood effect estimated as a residual schooling covariance after controlling for ability transmission is likely to be underestimated due to the negative interaction term (see for example, Chetty et al. (2016) and Plug et al. (2018)).
Alternatively, any ability effect estimated as a residual covariance after controlling for the network/neighborhood effect is likely to be underestimated for the same reason.
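The decomposition described above can be checked numerically. The following is a minimal Python sketch, not the paper's own derivation: it assumes one algebraically valid split of the covariance, with an ability term proportional to $p_H - p_L$, a network term proportional to $p(\Phi_L(c) - \Phi_H(c))$ and a residual interaction term. The exact expression used in the paper may differ, and the function name, argument names and parameter values are hypothetical.

```python
def covariance_decomposition(h, p, p_H, p_L, Phi_H, Phi_L):
    """Split the schooling covariance into ability, network and interaction parts.

    h            : share of high skill parents
    p            : reference probability of high ability (assumed p_L <= p <= p_H)
    p_H, p_L     : probability of high ability in high / low skill families
    Phi_H, Phi_L : values of Phi_H(c) and Phi_L(c), so that p_k * (1 - Phi_k)
                   is the share of children studying in family type k
    """
    share_H = p_H * (1 - Phi_H)                    # educated children, high skill families
    share_L = p_L * (1 - Phi_L)                    # educated children, low skill families
    h_child = h * share_H + (1 - h) * share_L      # share of high skill children

    # Covariance of two binary variables: E[xi * eta] - E[xi] * E[eta]
    direct = h * share_H - h * h_child
    closed_form = h * (1 - h) * (share_H - share_L)

    # Assumed split (illustrative, not taken from the paper)
    scale = h * (1 - h)
    ability = scale * (p_H - p_L)
    network = scale * p * (Phi_L - Phi_H)
    interaction = -scale * ((p_H - p) * Phi_H + (p - p_L) * Phi_L)

    return {
        "direct": direct,
        "closed_form": closed_form,
        "ability": ability,
        "network": network,
        "interaction": interaction,
    }


if __name__ == "__main__":
    # Hypothetical values chosen only for illustration
    result = covariance_decomposition(h=0.3, p=0.39, p_H=0.6, p_L=0.3,
                                      Phi_H=0.1, Phi_L=0.4)
    for name, value in result.items():
        print(f"{name:12s} {value: .5f}")
```

Under this split the three parts always sum to the covariance, and the interaction part is non-positive whenever $p_L \le p \le p_H$ and both $\Phi_k(c)$ are non-negative, which is the sign pattern the text describes.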
Next we consider the extended correlation coefficient and the impact of parameters $p_H$, $p_L$ and $h$. The correlation coefficient is given by:
$$\rho = \frac{\sqrt{h(1-h)}\,\left[p_H(1-\Phi_H(c)) - p_L(1-\Phi_L(c))\right]}{\sqrt{\tilde{h}(1-\tilde{h})}}$$
where $\tilde{h} = h\,p_H(1-\Phi_H(c)) + (1-h)\,p_L(1-\Phi_L(c))$ denotes the share of high skill children. Intuitively, this means that there are fewer children in low skill families obtaining education because: (1) the probability of being high ability for the child is lower and (2) [...]
To compare the fit of the restricted and unrestricted models to the data, we split the sample into 15 birth cohorts. The first cohort comprises survey respondents who were between 63 and 65 years old at the time of the survey, whereas the last cohort includes respondents who were between 21 and 23 years old. The third and fourth columns in Table 8 show the fractions of high-skilled children and parents in each cohort. As in Section 3, we estimate an educational transition matrix per cohort, which allows us to calculate the intergenerational correlation in education and the mobility ratio. The results are shown in the last two columns of Table 8. We can see that the share of high skill parents increased substantially over the considered period, from 0.135 to 0.390. The correlation coefficient also increased, from 0.292 to 0.375. At the same time, the correlation coefficient rose markedly across birth cohorts 1950–1970 but then stagnated and started to decrease. We capture this relationship between $h$ and $\rho$ by estimating a quadratic polynomial, which gives rise to the following regression:

Conclusion
In this paper, we develop a theoretical model to study the role of parental networks for intergenerational educational mobility. Children observe the skill types of their parents' social contacts and receive noisy signals of their wages. They use this information to form expectations about the market skill premium, which is otherwise unobservable. Schooling is modeled as a binary variable, hence children decide to become either high or low skilled based on their estimated return to schooling, inborn ability and the cost of education. We find that the skill premium estimated by children in low skill families is more dispersed, indicating higher uncertainty, than the estimates of children in high skill families. This happens if social networks exhibit skill homophily and high skill parents are a minority in the economy. As a result, children in low skill families are less likely to obtain higher education than children in high skill families, giving rise to a positive intergenerational schooling correlation. The correlation is amplified if high skill wages are more dispersed than low skill wages. We view this mechanism as complementary to the traditional explanations described in the literature on intergenerational mobility. Further, the model addresses the conceptual issue that wages of parents, close family members and friends (strong ties) are typically observed with high precision, whereas wages of former colleagues and acquaintances (weak ties) are observed with lower precision. In this case the expectation formation of the young generation suffers from heteroscedasticity. We prove that the optimal response of the youth is to assign a higher weight to the precise observations of parental wages and lower weights to the remaining observations (weighted least squares), which yields efficient estimates of the return to schooling.
We calibrate the model to match information about wages, social networks and schooling attainment observed in the German data (SOEP). Depending on the cost of education, the new mechanism described by our model can account for up to 15% of the observed intergenerational correlation coefficient. Based on the calibration we find that the correlation coefficient has low sensitivity to the degree of uncertainty about the return to schooling and to the size of the social network, but it is very sensitive to the share of high skill parents in the economy and the degree of skill homophily. The data reveal strong homophily by skill in German society, but its degree was stable over the last two decades. We therefore direct our attention to the link between the share of high skill parents/families and the intergenerational correlation coefficient, and find that the inverse U-shape relationship between these two variables predicted by the model is confirmed by the empirical data.
Finally, we extend the theoretical model and derive a novel decomposition of the intergenerational schooling covariance into the ability component, the network component and their interaction term. We find that the network effect is crowded out as the ability correlation gets stronger. This theoretical result may help to explain the relatively small network and neighborhood effects reported in a number of recent empirical studies (e.g. Chetty et al. (2016) and Plug et al. (2018)). Despite this finding, we show that the model with ability transmission and network effects provides a better fit to the data in the tails, when the share of high skill parents/families is particularly large or small.

Appendix
Proof of Proposition 1.
After substituting the expressions for $n_{LL}$, $n_{LH}$, $n_{HL}$, $n_{HH}$: which is equivalent to: Thus we have to compare the ratio of weighted variances with the following expression: The denominator of $G(s)$ can be rewritten as:
$$-(1-\alpha)(\phi h+1)s^2 + 2(1-\alpha)s + \alpha$$
which is a quadratic function of $s$ with one negative and one positive root; the latter is:
$$s^* = \frac{1}{\phi h+1}\left(1+\sqrt{1+\frac{\alpha(\phi h+1)}{1-\alpha}}\right)$$
Thus, the function $G(s)$ has a vertical asymptote at $s^*$, which can be larger or smaller than 1. Next differentiate $G(s)$ with respect to $s$: This means the following. For $s < s^*$, the function $G(s)$ starts at the value $G(0) = 1$, is decreasing for higher values of $s$ and converges to $-\infty$ as $s$ converges to $s^*$ from below. In this range we know that $G(s) \le 1$, hence any value of the weighted variance ratio $(V[w_i^H]/V[w_i^L])((1-h)/h)$ above 1 implies that the variance of the skill premium $\Delta_j^L$ is greater than that of $\Delta_j^H$. This is the first sufficient condition. It is the only condition if $s^* > 1$.
Next consider $s^* < s < 1$; then the function $G(s)$ is decreasing from $\infty$ down to $G(1) > 1$, thus the relevant sufficient condition becomes: This is the second sufficient condition (by assumption $h < 0.5$ and $n_{LH} = \phi h(1-\alpha) > 1$).
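As a quick numerical check of the root $s^*$ used in the proof above, the following Python sketch evaluates the denominator of $G(s)$ at $s^*$ and compares $s^*$ with 1. The function names and the parameter values are hypothetical and chosen only for illustration.

```python
import math


def g_denominator(s, alpha, phi, h):
    """Denominator of G(s): -(1 - alpha)(phi*h + 1) s^2 + 2(1 - alpha) s + alpha."""
    return -(1 - alpha) * (phi * h + 1) * s ** 2 + 2 * (1 - alpha) * s + alpha


def positive_root(alpha, phi, h):
    """Positive root s* of the denominator, valid for 0 < alpha < 1."""
    k = phi * h + 1
    return (1 + math.sqrt(1 + alpha * k / (1 - alpha))) / k


if __name__ == "__main__":
    # Two illustrative parameter sets (alpha, phi, h); values are assumptions
    for alpha, phi, h in [(0.5, 50, 0.2), (0.5, 5, 0.2)]:
        s_star = positive_root(alpha, phi, h)
        residual = g_denominator(s_star, alpha, phi, h)
        n_LH = phi * h * (1 - alpha)
        print(f"alpha={alpha}, phi={phi}, h={h}: "
              f"s*={s_star:.4f}, denominator(s*)={residual:.2e}, n_LH={n_LH:.2f}")
```

Since the denominator evaluated at $s = 1$ equals $1 - \phi h(1-\alpha)$, the root $s^*$ lies below 1 exactly when $\phi h(1-\alpha) > 1$. The first parameter set satisfies this inequality and the second does not, so the two cases of the proof are both represented.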
Proof of Lemma 2. The first order derivatives of $V[\Delta^L]$ and $V[\Delta^H]$ with respect to $\phi$ become: One can also observe that which is equivalent to Next, the derivatives of $V[\Delta^L]$ and $V[\Delta^H]$ with respect to $\alpha$ become: Further, differentiating $\Phi_k(c)$ ($k = L, H$) with respect to $\phi$ and $\alpha$, we get: This means that families of both types possess more precise information about the skill premium, and more children obtain education, when the social network of parents exhibits a lower degree of homophily or the average size of the network increases. Conversely, the education gap in both types of families increases when the social network exhibits a strong degree of homophily or the average size of the network is small.
$E[\xi\eta] = \sum_{x}\sum_{y} xy\,P(\xi = x, \eta = y) = 0 \times 0 \times P(\xi = 0, \eta = 0) + 1 \times 0 \times P(\xi = 1, \eta = 0) + 0 \times 1 \times P(\xi = 0, \eta = 1) + 1 \times 1 \times P(\xi = 1, \eta = 1) = hp(1-\Phi_H(c))$.
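Then, $cov[\xi, \eta] = hp(1-\Phi_H(c)) - h\tilde{h}$, where $\tilde{h}$ denotes the share of high skill children. Substituting in $\tilde{h}$, we obtain
Proof of Lemma 6. In order to simplify the notation, let the share of children obtaining education be denoted by $\tilde{p}_k = p_k(1-\Phi_k(c))$, $k = L, H$, so that $\tilde{h} = h\tilde{p}_H + (1-h)\tilde{p}_L$ and the correlation coefficient $\rho$ can be rewritten as:
$$\rho = \frac{\sqrt{h(1-h)}\,(\tilde{p}_H - \tilde{p}_L)}{\sqrt{\tilde{h}(1-\tilde{h})}}$$
Note that $\partial\rho/\partial\tilde{p}_H = (\partial\rho/\partial p_H)/(1-\Phi_H(c))$. Differentiating $\rho$ with respect to $\tilde{p}_H$ gives: Consider the numerator of the expression in the square bracket, which we denote by $F(\tilde{p}_L)$: This shows that $F(\tilde{p}_L)$ is a parabola in $\tilde{p}_L$ opening downward. Moreover, $F(\tilde{p}_L = 0) = h\tilde{p}_H > 0$ and $F(\tilde{p}_L = 1) = h(1-\tilde{p}_H) > 0$. We know that if a parabola is positive at two points and opens downward, then it must be strictly positive in the interval $\tilde{p}_L \in$ [0.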