Nice follow up. I often make the same point. A note: I wouldn’t be too quick to dismiss that the Flynn effect taps a real increase. Sure, measurement is a bigger concern than for height. But surely, the fact that we have good evidence that some of the causal antecedents changed over time (nutrient deficiencies, general health, and education) as well counts for something. I think it can be squared with the problem that it cannot be true that people a hundred years ago were disabled if you consider specialization. We probably lost something on a few specific skills around navigation, knowledge of plants and crafts (not tapped well in standard tests) and gained in academics.
I'm agnostic about the Flynn effect because while there are good reasons to think that many IQ-destroying things went away (poor sanitation, malnutrition, lack of education), there was substantial dysgenic fertility over the whole period the Flynn effect was operating.
Other than the obvious reason (which, in the interests of civility, I'll assume isn't your motive) what makes you think racial gaps are any more “real” than intergenerational ones? The observation that kids with low scores were very different than whites in the same percentile was a key part of Jensen's theory.
The straightforward interpretation of that is that the white samples had much higher rates of genuine, biologically rooted disability. For no obvious scientific reason, hereditarians have either completely ignored the issue or chosen to go along with Jensen’s more convoluted explanation.
It's worth keeping in mind that height is a composite measure comprising sitting height (trunk), limb length (legs and arms), and cranial height, and that research indicates that poor nutrition disproportionately impairs limb length due to its impact on long bone growth, particularly during critical developmental periods. For instance, chronic malnutrition may reduce leg length more than trunk or head dimensions, as long bones are highly sensitive to deficiencies in protein, calcium, or vitamin D. As a result, the secular increases in height and in IQ have parallel measurement issues.
To note, the concepts of biometric gene × environment interaction (G×E) and heritability × environment interactions are frequently conflated in the literature but represent distinct phenomena. G×E, a term in standard biometric models, quantifies the proportion of phenotypic variance due to interactions between genotypes and environments, where specific genotypes exhibit differential responses to environmental conditions. For example, one plant variety may thrive in fertile soil but perform poorly in arid conditions, while another shows the reverse pattern, with this variance attributed to G×E. Conversely, heritability × environment interactions describe how the proportion of phenotypic variance explained by genetic factors (heritability, h²) varies across environments, typically due to changes in total phenotypic variance. For instance, human height may have high heritability (h² = 0.8) in a stable environment, but in a malnutrition-prevalent environment, increased environmental variance may reduce heritability (e.g., h² = 0.5), even if genetic variance remains unchanged. In essence, G×E captures genotype-environment synergies, whereas heritability × environment interactions reflect shifts in the relative genetic contribution to phenotypic variance.
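To make the second case concrete, here is a minimal sketch of the arithmetic, assuming the simplest possible decomposition h² = V_G / (V_G + V_E) with no covariance or interaction terms. The variance values are hypothetical, chosen only so that the result reproduces the 0.8 and 0.5 figures used above.

```python
def heritability(v_g: float, v_e: float) -> float:
    """Share of phenotypic variance due to genetic variance.

    Assumes phenotypic variance is just v_g + v_e (no G×E or covariance terms).
    """
    return v_g / (v_g + v_e)


V_G = 4.0  # hypothetical genetic variance, held fixed across both environments

h2_stable = heritability(V_G, v_e=1.0)        # low environmental variance
h2_malnourished = heritability(V_G, v_e=4.0)  # environmental variance quadrupled

print(h2_stable, h2_malnourished)
```

The genetic variance is identical in both calls; only the denominator changes, which is what separates a heritability × environment shift from a G×E variance component.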
Super relevant follow up. I have heard people argue that those massive shocks like North/South Korea are categorically different from individual differences within modern populations. My counter has always been that the random, 100% environmental effect of being 6th born vs 1st born in large Swedish families is about 0.4 SD on IQ (that is the WITHIN family effect). So, all else being equal, in a very egalitarian educational system, the slip in parental attention, time, and resources can move cognition by a lot, see: https://www.sciencedirect.com/science/article/pii/S0160289615000045#f0010 For me this really brings the rGE mechanism back into play: I can imagine there could be child/parent and child/world phenotypes that elicit attentional/resource benefits that are as large as, or larger than, the attentional/resource differences between siblings born 1st, 2nd, 3rd, etc. BTW, this: "One could compare the correlation between a height PGS extracted from archeological samples with femur length or skeleton length and compare its accuracy with a modern sample to check if this is true." has been done: https://pmc.ncbi.nlm.nih.gov/articles/PMC9298243/ .
Since I'm being called out a bit in this post, let me respond with a few additional points.
>> "On the other hand, he didn’t engage very much with the IMO very plausible argument that sib-regression and RDR miss some kind of important genetic variation"
I'm happy to engage with a specific critique of Sib-Reg/RDR methods but if the argument is that they miss "some kind of important genetic variation", what is there to engage with? I think we now both agree that Sib-Reg/RDR are quantitative genetic models that use random within-family segregation to control for environments (in slightly different ways) instead of making environmental assumptions. I think we also agree that these methods do not have the flaws that were ascribed to them in your original post (a "yes, but" -- https://www.astralcodexten.com/p/but-vs-yes-but -- would be appropriate here, by the way). The original post argued that quantitative methods like twin/pedigrees are solid and molecular methods like RDR are suspicious. Now that we've established that Sib-Reg/RDR operate exactly like quantitative methods but with fewer assumptions, we are suddenly supposed to treat them with suspicion too? Why? It is hard not to conclude that hereditarians simply prefer low quality analyses because such analyses produce appealing estimates.
>> "and the fact that the lack of similarity in adoptees makes strong shared environmental effects rather unlikely and this is hard to explain by some issue of the design."
This connects to the figure from Bingley et al. in your post and the figure from Eftedal et al. (https://www.pnas.org/doi/10.1073/pnas.2419627122) you link, which are clear points against the quoted claim. Bingley et al. demonstrate that the effect of genes can be extremely low if you lift the equal environment assumption and much lower than what is estimated from a twin model *in the same exact cohort*. That means twin models *do not* line up with "all kinds of other pedigree studies" for the traits Bingley studied. Eftedal et al. demonstrate highly significant correlation of adopted siblings (and note that these are international adoptions, which will have an unusual shared environment by nature) and large correlations even for relatives of adopted siblings; another indicator of shared environment but this time on psychometrically valid IQ-like achievement measures. They conclude that estimates from the twin model cannot be reconciled with the level of assortative mating that is observed nor with the level of in-law resemblance nor with the adoptee correlations. A core finding from these two papers is that twin estimates *do not* generalize to broader relatedness classes, and yet this post repeatedly cites them as evidence in support of twin estimates. What am I missing? This just seems like a clear misreading of the cited studies.
>> "gene-environment interactions"
The post fixates on the Scarr-Rowe effect while missing a decade of follow-up work showing that amplification GxE is widespread in GWAS biobanks. I've cited multiple studies on this phenomenon for education and IQ here (http://gusevlab.org/projects/hsq/#h.eujbeu4ca5ot) and written about other traits here (https://theinfinitesimal.substack.com/p/gene-environment-interactions-ubiquitous); this is a consistent finding across many different papers. In the case of education/IQ they are substantial for socioeconomic status alone, with heritability doubling when going from high to low SES. For other traits, they start to explain a substantial amount of phenotypic variance in aggregate. Interestingly, these effects generally go in the *opposite* direction of the Scarr-Rowe phenomenon in twins; another weird result from twin studies that does not generalize to the broader population.
>> "this recent paper from the TEDS, which found a much more modest within-family molecular heritability of g at 0.3 but a decent one for education ... These molecular methods don’t just fail to replicate pedigree studies: they fail to replicate each other too"
This paper does not estimate heritability *at all*, you are misinterpreting the correlation of a polygenic score which is NOT a heritability estimate. The reported polygenic score effect is in line with what has been observed in other studies. Separately, the Sib-Reg paper you cite is the only study that has looked at IQ so far, so there is no other study to compare to. The estimate for educational attainment, which HAS been analyzed in other studies, is in line with other sib-reg analyses (though all of the sib-reg estimates still come with enormous uncertainty, which is why RDR is the superior method). So the entire argument here about methods failing to replicate each other is just empirically wrong and seems to be based largely on a misunderstanding of the TEDS paper.
---
TLDR: (1) Sib-Reg and RDR are strong methodological designs, yet neither the original post nor this follow-up articulates a mechanism by which they should provide significantly deflated estimates. (2) Twin-based estimates do not generalize to other relationship classes in multiple large registry analyses, yet those analyses are erroneously cited in this post as evidence in favor of twin estimates. (3) Amplification GxE within ordinary biobank populations is now well established across multiple studies, and has long been hypothesized as the reason twin studies produce inflated estimates (see Robinson et al. 2017 exploring this exact question for BMI: https://pubmed.ncbi.nlm.nih.gov/28692066/). (4) Purported inconsistency of molecular methods is based on a misreading of the results.
I understand why YOU think sib-regression/RDR are "exactly like quantitative methods" but there are obviously huge differences in the type of data they operate on. They are nice, elegant methods, but if they fail to replicate standard quantitative estimates, the latter are not automatically invalid: there are too many bells and whistles in sib-regression/RDR too. This has always been my position.
Yes, the lack of adoptee similarity is still a big reason I distrust low molecular heritabilities. It is misleading to call adoptee similarity in Eftedal et al "highly significant", it was ~r=0.15 vs r~0.5 for biological siblings, a difference that was mirrored in adopted vs. biological cousins which the authors also point out. Shared environmental effects on test scores of this magnitude would absolutely be expected. In the first post, I linked many adoption studies which demonstrate low (often zero in adulthood) similarity.
I don't think it's true that "twin estimates *do not* generalize to broader relatedness classes". I hate to quote the discussion section, but Eftedal et al specifically take issue with Bingley-style very low estimates of genetic effects (Collado et al 2023) and emphasize that, actually, substantial heritability estimates can be obtained from non-twin relationships. I re-read Bingley carefully, and although there is nothing glaringly wrong with that paper and I try not to quibble with specific modelling choices, which I find pointless, I'm wondering how much work the specific shared environmental paths by not just zygosity but also sex composition are doing. As I'm illustrating in the post, similarity scales up nicely with genetic similarity even in distant relatives: for example, MZ uncle-nephew r=0.16, with DZ uncle 0.08, male cousins through MZs r=0.22, through DZs r=0.09. The extra similarity which is poorly modelled by genetic effects is mostly between the same generation (siblings and cousins), not cross-generation (parents and avuncular), so it also crossed my mind that this can be a period effect (e.g. all young people go to college more than their parents used to).
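As a rough check on the correlations just quoted, the sketch below divides each one by the textbook additive relatedness coefficient for that pair type. This is a deliberately naive calculation: it assumes random mating, purely additive effects, and no shared environment, none of which is claimed by the cited papers, and the "implied" values are illustrative rather than heritability estimates.

```python
# (expected additive relatedness, observed correlation quoted above)
# Relatedness values assume random mating: the child of an MZ twin is
# genetically like a child of the co-twin (0.5), and cousins through MZ
# twins are genetically like half-siblings (0.25).
pairs = {
    "uncle-nephew via MZ twin": (0.50, 0.16),
    "uncle-nephew via DZ twin": (0.25, 0.08),
    "cousins via MZ twins": (0.25, 0.22),
    "cousins via DZ twins": (0.125, 0.09),
}

# Naive implied additive share: observed correlation / relatedness.
implied = {name: obs / rel for name, (rel, obs) in pairs.items()}

for name, value in implied.items():
    print(f"{name}: {value:.2f}")
```

Under these assumptions the two avuncular pairs imply the same value, while the two cousin pairs imply much larger ones, which is the same-generation excess described above.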
There are lots of extended pedigree studies, many collected e.g. here:
, which virtually all find much higher A than C effects. You may be right that twin estimates are inflated, but the reason people are interested in this debate is the point that genetic effects are much more important than shared environmental ones and I don't think this point has been meaningfully challenged.
Your post on GxE is very good. I wouldn't find it surprising for such effects to exist. But I disagree that "In the case of education/IQ they are substantial for socioeconomic status alone, with heritability doubling when going from high to low SES", so Scarr-Rowe. This is not a ubiquitous finding; for example, Rask-Andersen et al found the opposite in the UK Biobank (with molecular data): https://psychiatryonline.org/doi/10.1176/appi.ajp.2020.20040462
YES, BUT: you caught me committing a big error mentioning Lin et al as a molecular heritability paper when it's a PGS paper. I misremembered that SNP heritability was also reported and only looked at the tables when linking it. But it still is true that molecular heritabilities scatter a lot. For example, in the original Young et al paper the sib-regression heritability of education was 39.7% (SE=14.8, 32500 pairs), in Markel et al it is 7.6% (SE=9.5, 80000 pairs). Within the Markel paper, even for height and ignoring the very small WLS, the estimates go from 0.49 to 0.92. For comparison, in the height cohorts here: https://pismin.com/10.1375/twin.6.5.399, twin heritabilities range 0.76-0.87 for men and 0.68-0.85 for women. Some of this is just low precision but it's an issue to be considered.
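To put a number on "some of this is just low precision", here is a minimal sketch that compares the two sib-regression education estimates quoted above using only their reported standard errors. It assumes the two estimates are independent and approximately normal; whether the underlying cohorts overlap is not something the figures here can tell us, so treat the result as a rough guide.

```python
import math


def z_test_difference(est_a, se_a, est_b, se_b):
    """Two-sided z-test for the difference between two independent estimates."""
    diff = est_a - est_b
    se_diff = math.sqrt(se_a**2 + se_b**2)  # SEs add in quadrature if independent
    z = diff / se_diff
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p, normal approximation
    return z, p


# Heritability of education in percent, with standard errors, as quoted above.
z, p = z_test_difference(39.7, 14.8, 7.6, 9.5)
print(f"z = {z:.2f}, p = {p:.3f}")
```

On these inputs the gap is a bit under two standard errors of the difference, so the two point estimates look far apart but are not clearly inconsistent given their precision.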
So just to be clear. You are discarding the RDR results because they have "too many bells and whistles". You are plotting the results from Bingley et al. but ignoring their actual findings that genes can matter very little if there are unequal environments. You are citing the figure from Eftedal et al. to claim that twin estimates generalize to other relationships and adoptees while ignoring their key finding that twin estimates DO NOT generalize to other relationships and adoptees. I'm at a loss for what's actually being argued here. It seems like each claim is meant to be more about vibes than the actual cited findings, and then I guess we're supposed to stack these vibes together and draw a hereditarian conclusion.
By the way, "In the case of education/IQ they are substantial for socioeconomic status alone, with heritability doubling when going from high to low SES" is not Scarr-Rowe, it is the *opposite*. And I was indeed referring to the results from Rask-Andersen (among other studies) and even included a figure from Rask-Andersen in the linked description (http://gusevlab.org/projects/hsq/#h.eujbeu4ca5ot). Again it is unclear to me what the disagreement is.
Great follow up, and great work steelmanning the anti-hereditarian position, especially with respect to Gusev's nonsense. But:
"Between MZ twins, every difference is automatically nonshared environmental: no difference in genes, no heritability. If they were real, Star Wars clone troopers probably had somewhat different personalities and intelligence: these are would be all environmental."
The Clone Troopers *do* have different personalities and intelligences (indeed, that's at the root of the story of the TV show Star Wars: The Clone Wars). This is in part due to (deliberately engineered) tiny genetic differences between them. (The "Bad Batch" are the shining examples of this.) The Clone Troopers aren't genetically identical, and for that matter neither are MZ twins (which means that heritability is in fact slightly underestimated).
"The Koreans are just one people with no plausible segregation of height genes"
Not exactly true. There is more Jurchen ancestry in North Korea than in South Korea, for one (see the book The Northern Region of Korea). Whether or not this affects the height genes across the peninsula is not clear, but it conceivably could.
"But within Europe, genetic distances are minimal, migration and intermarriage was frequent, so it is not plausible that genetic differences account for the quite large differences in development."
Nope. Allow me to introduce you to a discovery of one of your co-ethnics, the Hajnal line. Heck, even within *Germany* today–a single country–genetic regional differences in development persist long after the fall of communism. Then there is the whole matter of persistent regional differences across the United States and Canada that can be traced to differences in the founding British population, depending upon the region of Britain they originated (the American Nations).
Thanks for correcting me on the clone trooper stuff, I wanted to be smart but I don't know enough Star Wars lore.
Yes, Korea is big enough to have population structure but I don't think this can be the cause of the height difference, because as the age-stratified data shows the height difference wasn't there before economic divergence happened.
I know the Hajnal line idea, which is why I brought up Czechia as an example, which is west of the Hajnal line and wasn't any less modern and developed than any other place in the Germanic realms before WW2/Communism. (I'm not sure how well known this is but interbellum Czechoslovakia was quite rich even by global standards, even with much less developed Slovakia and Subcarpathia attached.) I saw all the HBD Hajnal line maps so I know many other differential development patterns in Europe are long-standing, but IMO there is no convincing argument that these are genetic and not cultural. Because of the genetic closeness of populations (e.g. Austrians and neighboring former Communist countries) and because Eastern European immigrants do fine in the West I'm skeptical of a genetic explanation.
"Yes, Korea is big enough to have population structure but I don't think this can be the cause of the height difference, because as the age-stratified data shows the height difference wasn't there before economic divergence happened."
Imagine saying that about Europe (including the Dutch) before the 20th century.
There are north-south differences in height and other things in Japan, which isn't a whole lot larger than Korea.
"I saw all the HBD Hajnal line maps so I know many other differential development patterns in Europe are long-standing, but IMO there is no convincing argument that these are genetic and not cultural."
You should do a steelman set of arguments for how subracial ethnic differences are cultural and not genetic. You might notice something very interesting if you were to try...
"Because of the genetic closeness of populations (e.g. Austrians and neighboring former Communist countries)"
Northern Italy is very close genetically to Southern Italy yet there is a one standard deviation difference in average IQ between parts of the north and south–among other things.
"because Eastern European immigrants do fine in the West"
Selective migration–beyond not looking hard enough. Look at differences in attitudes and beliefs.
Correct me if I'm wrong but I think Northern Europeans and specifically the Dutch were always tall while South Koreans were not. I don't deny that there might be genetic height differences within Korea, it's just that the current ones are environmental because they appeared in just a few decades.
I also don't deny that there might be genetic subracial trait differences within Europeans, it's just that 1) these are not on IQ (because there is no phenotypic difference) and 2) there are demonstrated cases when formerly developed regions were semi-permanently stuck behind due to historical forces. Also, if Austrians are genetically hardwired to at least twice higher levels of development than Czechs or Hungarians then this is the result of truly subtle genetic effects because the genetic differentiation of these populations is much smaller than the North-South Italy difference. (In the Nelis et al 2009 article Wikipedia uses for its within-Europe Fst matrix, Northern Italians are closer to Hungarians than Southern Italians and the Hungary-Austria difference is the smallest I can eyeball in the whole chart.)
"I also don't deny that there might be genetic subracial trait differences within Europeans, it's just that 1) these are not on IQ (because there is no phenotypic difference)"
It's not exactly true there are no IQ differences between Northwestern and Northeastern Europeans.
But there is much more to national development than average IQ, broadly the suite of psychological and behavioral traits that make up WEIRDness (e.g., trust/trustworthiness, openness to experience, imagination, rule-abiding/corruption/nepotism etc).
This:
"there are demonstrated cases when formerly developed regions were semi-permanently stuck behind due to historical forces."
...wasn't a historical accident. There is a reason innovation happened where it did (and continues to happen where it does).
"if Austrians are genetically hardwired to at least twice higher levels of development than Czechs or Hungarians then this is the result of truly subtle genetic effects because the genetic differentiation of these populations is much smaller"
Genetic distance ≠ phenotypic distance. The suite of WEIRD traits substantially differs across Northern Europe going from West to East, even if average IQ stays roughly constant.
North Koreans are actually genetically taller than South Koreans. I estimate their genetic potential to be around 177 cm for men while Korea probably has a 175.5 cm ceiling.
I don't think the twin studies are wrong or that height (or IQ or anything else they say) is not strongly heritable! What I am reiterating is that heritability doesn't mean genetic determinism, it's just that genes are the principal causes of differences in one context (although a quite relevant one, within the same society at the same time). This is not a new idea, this has always been the definition of heritability.
What your example shows is that heritability is specific to a population. So if all the twin studies' participants are white Americans born in the 80s, you can find out that the heritability of height is 0.9 for that population, which shares an environment. But that doesn't say anything about how heritable height is in another population or how heritable it would be in a more diverse sample with more diverse environments. That's why twin studies can't distinguish between the effect of genes and environment, but always lump in a lot of GxE with the direct effect of genes, which is Stone's point.
A lot of hypotheses about environmental causes - many make it into policy - assume that the environment operates even within the limited within-country, within-era range. For example, sociologists and egalitarian politicians often assume that generational poverty or violence is due to each generation learning bad patterns from their parents, so if we break the cycle once it is gone forever. Standard behavior genetic studies still make this look unlikely, which is a big deal on its own.
My counter-examples are against an even more serious hereditarianism which assumes that genes are always the cause. I think you are correct to be skeptical about this yourself.
I'm not sure that those egalitarian politicians are wrong. American twin studies are conducted on overwhelmingly white, non-abusive, middle-income or better households, so we don't know how growing up poor and black in an abusive household affects traits. I don't think we do at least, but there may be work on that question, which it would be good if someone would write up.
Note also what I wrote about in the piece about the lack of a well-replicating Scarr-Rowe effect (growing up poor doesn't diminish genetic effects), and specifically the Pesta racial meta-analysis which shows that heritability is no lower in Blacks.
The question isn't whether heritability is lower or higher among blacks than whites, but whether heritability estimates would decrease if twins reared apart were exposed to a wider range of environments. Those national samples don't tell us anything about the question, I don't believe, because the individual pairs of twins would have shared overwhelmingly the same environments as well as genes.
I found this paper which I think addresses the real problems of separating environmental from genetic effects. It tries to estimate the environmental effect of growing up in different countries on well-being by using international metrics of well-being and integrating them with what we know about twins. Of course, the same thing could be done with IQ, if you accepted that all or most of international variation in IQ was due to environment. You would certainly find that environment plays a vastly greater role in IQ than twin studies have been able to measure. https://journals.sagepub.com/doi/pdf/10.1177/17456916231178716
It's worth keeping in mind that height is a composite measure comprising sitting height (trunk), limb length (legs and arms), and cranial height, and that research indicates that poor nutrition disproportionately impairs limb length due to its impact on long bone growth, particularly during critical developmental periods. For instance, chronic malnutrition may reduce leg length more than trunk or head dimensions, as long bones are highly sensitive to deficiencies in protein, calcium, or vitamin D. As a result, the secular increases in height and in IQ have parallel measurement issues.
This is another good point.
To note, the concepts of biometric gene × environment interaction (G×E) and heritability × environment interactions are frequently conflated in the literature but represent distinct phenomena. G×E, a term in standard biometric models, quantifies the proportion of phenotypic variance due to interactions between genotypes and environments, where specific genotypes exhibit differential responses to environmental conditions. For example, one plant variety may thrive in fertile soil but perform poorly in arid conditions, while another shows the reverse pattern, with this variance attributed to G×E. Conversely, heritability × environment interactions describe how the proportion of phenotypic variance explained by genetic factors (heritability, h²) varies across environments, typically due to changes in total phenotypic variance. For instance, human height may have high heritability (h² = 0.8) in a stable environment, but in a malnutrition-prevalent environment, increased environmental variance may reduce heritability (e.g., h² = 0.5), even if genetic variance remains unchanged. In essence, G×E captures genotype-environment synergies, whereas heritability × environment interactions reflect shifts in the relative genetic contribution to phenotypic variance.
This is a good point.
Super relevant follow up. I have heard people argue that those massive shocks like North/South Korea are categorically different from individual differences within modern populations. My counter has always been that the random, 100% environmental effect of being 6th born vs 1st born in large Swedish families is about 0.4 SD on IQ (that is the WITHIN family effect). So, all else being equal, in a very egalitarian educational system, the slip in parental attention, time, and resources can move cognition by a lot, see: https://www.sciencedirect.com/science/article/pii/S0160289615000045#f0010 For me this really brings the rGE mechanism back into play: I can imagine there could be child/parent and child/world phenotypes that elicit attentional/resource benefits that are as large as, or larger than, the attentional/resource differences between siblings born 1st, 2nd, 3rd, etc. BTW, this: "One could compare the correlation between a height PGS extracted from archeological samples with femur length or skeleton length and compare its accuracy with a modern sample to check if this is true." has been done: https://pmc.ncbi.nlm.nih.gov/articles/PMC9298243/ .
Yes, and moving up the birth order because your older sibling dies makes you match your social, not your biological, position in the birth order: https://www.science.org/doi/10.1126/science.1141493
A limitation (if you can call it that) to these studies though is that in Swedish military exams there IS a quite substantial C effect for age 18 IQ, so it's not that surprising that family effects are still operating: https://www.sciencedirect.com/science/article/pii/S0160289614001676?via%3Dihub
It is very strange that in Sweden, despite it being a very egalitarian country, C effects are larger and more resilient to the Wilson effect than elsewhere.
Since I'm being called out a bit in this post, let me respond with a few additional points.
>> "On the other hand, he didn’t engage very much with the IMO very plausible argument that sib-regression and RDR miss some kind of important genetic variation"
I'm happy to engage with a specific critique of Sib-Reg/RDR methods but if the argument is that they miss "some kind of important genetic variation", what is there to engage with? I think we now both agree that Sib-Reg/RDR are quantitative genetic models that use random within-family segregation to control for environments (in slightly different ways) instead of making environmental assumptions. I think we also agree that these methods do not have the flaws that were ascribed to them in your original post (a "yes, but" -- https://www.astralcodexten.com/p/but-vs-yes-but -- would be appropriate here, by the way). The original post argued that quantitative methods like twin/pedigrees are solid and molecular methods like RDR are suspicious. Now that we've established that Sib-Reg/RDR operate exactly like quantitative methods but with fewer assumptions, we are suddenly supposed to treat them with suspicion too? Why? It is hard not to conclude that hereditarians simply prefer low quality analyses because such analyses produce appealing estimates.
>> "and the fact that the lack of similarity in adoptees makes strong shared environmental effects rather unlikely and this is hard to explain by some issue of the design."
This connects to the figure from Bingley et al. in your post and the figure from Eftedal et al. (https://www.pnas.org/doi/10.1073/pnas.2419627122) you link, which are clear points against the quoted claim. Bingley et al demonstrates that the effect of genes can be extremely low if you lift the equal environment assumption and much lower than what is estimated from a twin model *in the same exact cohort*. That means twin models *do not* line up with "all kinds of other pedigree studies" for the traits Bingley studied. Eftedal et al. demonstrate highly significant correlation of adopted siblings (and note that these are international adoptions, which will have an unusual shared environment by nature) and large correlations even for relatives of adopted siblings; another indicator of shared environment but this time on psychometrically valid IQ-like achievement measures. They conclude that estimates from the twin model cannot be reconciled with the level of assortative mating that is observed nor with the level of in-law resemblance nor with the adoptee correlations. A core finding from these two papers is that twin estimate *do not* generalize to broader relatedness classes, and yet this post repeatedly cites them as evidence in support of twin estimates. What am I missing, this just seems like a clear misreading of the cited studies?
>> "gene-environment interactions"
The post fixates on the Scarr-Rowe effect while missing a decade of follow-up work showing that amplification GxE is widespread in GWAS biobanks. I've cited multiple studies on this phenomenon for education and IQ here (http://gusevlab.org/projects/hsq/#h.eujbeu4ca5ot) and written about other traits here (https://theinfinitesimal.substack.com/p/gene-environment-interactions-ubiquitous); this is a consistent finding across many different papers. In the case of education/IQ they are substantial for socioeconomic status alone, with heritability doubling when going from high to low SES. For other traits, they start to explain a substantial amount of phenotypic variance in aggregate. Interestingly, these effects generally go in the *opposite* direction of the Scarr-Rowe phenomenon in twins; another weird result from twin studies that does not generalize to the broader population.
>> "this recent paper from the TEDS, which found a much more modest within-family molecular heritability of g at 0.3 but a decent one for education ... These molecular methods don’t just fail to replicate pedigree studies: they fail to replicate each other too"
This paper does not estimate heritability *at all*; you are misinterpreting the correlation of a polygenic score, which is NOT a heritability estimate. The reported polygenic score effect is in line with what has been observed in other studies. Separately, the Sib-Reg paper you cite is the only study that has looked at IQ so far, so there is no other study to compare to. The estimate for educational attainment, which HAS been analyzed in other studies, is in line with other sib-reg analyses (though all of the sib-reg estimates still come with enormous uncertainty, which is why RDR is the superior method). So the entire argument here about methods failing to replicate each other is just empirically wrong and seems to be based largely on a misunderstanding of the TEDS paper.
---
TLDR: (1) Sib-Reg and RDR are strong methodological designs, yet neither the original post nor this follow-up articulates a mechanism by which they should provide significantly deflated estimates. (2) Twin-based estimates do not generalize to other relationship classes in multiple large registry analyses, yet those analyses are erroneously cited in this post as evidence in favor of twin estimates. (3) Amplification GxE within ordinary biobank populations is now well established across multiple studies, and has long been hypothesized as the reason twin studies produce inflated estimates (see Robinson et al. 2017 exploring this exact question for BMI: https://pubmed.ncbi.nlm.nih.gov/28692066/). (4) Purported inconsistency of molecular methods is based on a misreading of the results.
I understand why YOU think sib-regression/RDR are "exactly like quantitative methods" but there are obviously huge differences in the type of data they operate on. They are nice, elegant methods, but if they fail to replicate standard quantitative estimates, the latter are not automatically invalid; sib-regression/RDR have too many bells and whistles of their own. This has always been my position.
Yes, the lack of adoptee similarity is still a big reason I distrust low molecular heritabilities. It is misleading to call adoptee similarity in Eftedal et al. "highly significant": it was r~0.15 vs. r~0.5 for biological siblings, a difference that was mirrored in adopted vs. biological cousins, which the authors also point out. Shared environmental effects on test scores of this magnitude would absolutely be expected. In the first post, I linked many adoption studies which demonstrate low (often zero in adulthood) similarity.
I don't think it's true that "twin estimates *do not* generalize to broader relatedness classes". I hate to quote the discussion section, but Eftedal et al. specifically take issue with Bingley-style very low estimates of genetic effects (Collado et al. 2023) and emphasize that, actually, substantial heritability estimates can be obtained from non-twin relationships. I re-read Bingley carefully, and although there is nothing glaringly wrong with that paper and I try not to quibble with specific modelling choices, which I find pointless, I'm wondering how much work the specific shared environmental paths by not just zygosity but also sex composition are doing. As I'm illustrating in the post, similarity scales up nicely with genetic similarity even in distant relatives: for example, MZ uncle-nephew r=0.16, with DZ uncle 0.08, male cousins through MZs r=0.22, through DZs r=0.09. The extra similarity which is poorly modelled by genetic effects is mostly between the same generation (siblings and cousins), not cross-generation (parents and avuncular), so it also crossed my mind that this could be a period effect (e.g. all young people go to college more than their parents used to).
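To make the "scales with genetic similarity" comparison concrete, here is a minimal sketch that divides each quoted correlation by the expected additive relatedness of the pair. It assumes a purely additive model with no shared environment and no assortative mating, which is a simplification the papers under discussion explicitly dispute, so the outputs are only a reading aid. The relatedness coefficients (an MZ co-twin of a parent counts as a genetic parent, cousins through MZ twins count as genetic half-siblings) and the name `implied_h2` are introduced here for illustration.

```python
# (label, correlation quoted in the comment, expected additive relatedness)
pairs = [
    ("MZ uncle-nephew", 0.16, 0.5),         # uncle is genetically like a parent
    ("DZ uncle-nephew", 0.08, 0.25),        # ordinary avuncular relatedness
    ("cousins via MZ twins", 0.22, 0.25),   # genetically like half-siblings
    ("cousins via DZ twins", 0.09, 0.125),  # ordinary first cousins
]

def implied_h2(r, relatedness):
    """Heritability implied by r = relatedness * h2 (additive-only assumption)."""
    return r / relatedness

results = {label: implied_h2(r, k) for label, r, k in pairs}
for label, h2 in results.items():
    print(f"{label:22s} implied h2 = {h2:.2f}")
```

Under these assumptions the two cross-generation pairs imply the same value, while the same-generation cousin pairs imply noticeably higher ones, which is the pattern the comment describes as extra same-generation similarity.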
There are lots of extended pedigree studies, many collected e.g. here:
https://inquisitivebird.xyz/p/where-parents-make-a-difference
or this one: https://link.springer.com/article/10.1007/s10519-011-9507-9#libraryItemId=7226050
or this one: https://onlinelibrary.wiley.com/doi/abs/10.1111/j.1540-5907.2010.00461.x#libraryItemId=4516000
Virtually all of these find much higher A than C effects. You may be right that twin estimates are inflated, but the reason people are interested in this debate is the point that genetic effects are much more important than shared environmental ones and I don't think this point has been meaningfully challenged.
Your post on GxE is very good. I wouldn't find it surprising for such effects to exist. But I disagree that "In the case of education/IQ they are substantial for socioeconomic status alone, with heritability doubling when going from high to low SES", so Scarr-Rowe. This is not a ubiquitous finding; for example, Rask-Andersen et al. found the opposite in the UK Biobank (with molecular data): https://psychiatryonline.org/doi/10.1176/appi.ajp.2020.20040462
YES, BUT: you caught me committing a big error mentioning Lin et al as a molecular heritability paper when it's a PGS paper. I misremembered that SNP heritability was also reported and only looked at the tables when linking it. But it still is true that molecular heritabilities scatter a lot. For example, in the original Young et al paper the sib-regression heritability of education was 39.7% (SE=14.8, 32500 pairs), in Markel et al it is 7.6% (SE=9.5, 80000 pairs). Within the Markel paper, even for height and ignoring the very small WLS, the estimates go from 0.49 to 0.92. For comparison, in the height cohorts here: https://pismin.com/10.1375/twin.6.5.399, twin heritabilities range 0.76-0.87 for men and 0.68-0.85 for women. Some of this is just low precision but it's an issue to be considered.
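As a rough check on how much of the education gap could be low precision alone, here is a minimal sketch of a z-test on the two quoted sib-regression estimates. It assumes the two estimates are independent and approximately normally distributed, which is an assumption added here and may not hold if the cohorts overlap. The function name `z_difference` is made up for illustration.

```python
import math

def z_difference(est1, se1, est2, se2):
    """z statistic and two-sided p-value for the difference of two
    independent, approximately normal estimates."""
    z = (est1 - est2) / math.sqrt(se1**2 + se2**2)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return z, p

# Sib-regression heritability of education in percent, as quoted above:
# 39.7 (SE=14.8) in Young et al. vs. 7.6 (SE=9.5) in Markel et al.
z, p = z_difference(39.7, 14.8, 7.6, 9.5)
print(f"z = {z:.2f}, two-sided p = {p:.3f}")
```

Under those assumptions the difference comes out at a bit under two standard errors, so the standard errors are large enough that the two point estimates are not clearly inconsistent with each other.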
So just to be clear. You are discarding the RDR results because they have "too many bells and whistles". You are plotting the results from Bingley et al. but ignoring their actual findings that genes can matter very little if there are unequal environments. You are citing the figure from Eftedal et al. to claim that twin estimates generalize to other relationships and adoptees while ignoring their key finding that twin estimates DO NOT generalize to other relationships and adoptees. I'm at a loss for what's actually being argued here. It seems like each claim is meant to be more about vibes than the actual cited findings, and then I guess we're supposed to stack these vibes together and draw a hereditarian conclusion.
By the way, "In the case of education/IQ they are substantial for socioeconomic status alone, with heritability doubling when going from high to low SES" is not Scarr-Rowe, it is the *opposite*. And I was indeed referring to the results from Rask-Andersen (among other studies) and even included a figure from Rask-Andersen in the linked description (http://gusevlab.org/projects/hsq/#h.eujbeu4ca5ot). Again it is unclear to me what the disagreement is.
Great follow up, and great work steelmanning the anti-hereditarian position, especially with respect to Gusev's nonsense. But:
"Between MZ twins, every difference is automatically nonshared environmental: no difference in genes, no heritability. If they were real, Star Wars clone troopers probably had somewhat different personalities and intelligence: these are would be all environmental."
The Clone Troopers *do* have different personalities and intelligences (indeed, that's at the root of the story of the TV show Star Wars: The Clone Wars). This is in part due to (deliberately engineered) tiny genetic differences between them. (The "Bad Batch" are the shining examples of this.) The Clone Troopers aren't genetically identical, and for that matter neither are MZ twins (which means that heritability is in fact slightly underestimated).
"The Koreans are just one people with no plausible segregation of height genes"
Not exactly true. There is more Jurchen ancestry in North Korea than in South Korea, for one (see the book The Northern Region of Korea). Whether or not this affects the height genes across the peninsula is not clear, but it conceivably could.
"But within Europe, genetic distances are minimal, migration and intermarriage was frequent, so it is not plausible that genetic differences account for the quite large differences in development."
Nope. Allow me to introduce you to a discovery of one of your co-ethnics, the Hajnal line. Heck, even within *Germany* today–a single country–genetic regional differences in development persist long after the fall of communism. Then there is the whole matter of persistent regional differences across the United States and Canada that can be traced to differences in the founding British population, depending upon the region of Britain they originated (the American Nations).
Thanks for correcting me on the clone trooper stuff, I wanted to be smart but I don't know enough Star Wars lore.
Yes, Korea is big enough to have population structure but I don't think this can be the cause of the height difference, because as the age-stratified data shows the height difference wasn't there before economic divergence happened.
I know the Hajnal line idea, which is why I brought up Czechia as an example, which is west of the Hajnal line and wasn't any less modern and developed than any other place in the Germanic realms before WW2/Communism. (I'm not sure how well known this is but interbellum Czechoslovakia was quite rich even by global standards, even with much less developed Slovakia and Subcarpathia attached.) I saw all the HBD Hajnal line maps so I know many other differential development patterns in Europe are long-standing, but IMO there is no convincing argument that these are genetic and not cultural. Because of the genetic closeness of populations (e.g. Austrians and neighboring former Communist countries) and because Eastern European immigrants do fine in the West I'm skeptical of a genetic explanation.
"Yes, Korea is big enough to have population structure but I don't think this can be the cause of the height difference, because as the age-stratified data shows the height difference wasn't there before economic divergence happened."
Imagine saying that about Europe (including the Dutch) before the 20th century.
There are north-south differences in height and other things in Japan, which isn't a whole lot larger than Korea.
"I saw all the HBD Hajnal line maps so I know many other differential development patterns in Europe are long-standing, but IMO there is no convincing argument that these are genetic and not cultural."
You should do a steelman set of arguments for how subracial ethnic differences are cultural and not genetic. You might notice something very interesting if you were to try...
"Because of the genetic closeness of populations (e.g. Austrians and neighboring former Communist countries)"
Northern Italy is very close genetically to Southern Italy, yet there is a one standard deviation difference in average IQ between parts of the north and south–among other things.
"because Eastern European immigrants do fine in the West"
Selective migration–beyond not looking hard enough. Look at differences in attitudes and beliefs.
Correct me if I'm wrong but I think Northern Europeans and specifically the Dutch were always tall while South Koreans were not. I don't deny that there might be genetic height differences within Korea, it's just that the current ones are environmental because they appeared in just a few decades.
I also don't deny that there might be genetic subracial trait differences within Europeans, it's just that 1) these are not on IQ (because there is no phenotypic difference) and 2) there are demonstrated cases when formerly developed regions were semi-permanently stuck behind due to historical forces. Also, if Austrians are genetically hardwired to at least twice higher levels of development than Czechs or Hungarians then this is the result of truly subtle genetic effects because the genetic differentiation of these populations is much smaller than the North-South Italy difference. (In the Nelis et al 2009 article Wikipedia uses for its within-Europe Fst matrix, Northern Italians are closer to Hungarians than Southern Italians and the Hungary-Austria difference is the smallest I can eyeball in the whole chart.)
Before 1850 the Dutch were as short as Italians, shorter than other northern Europeans. Their height surged ahead during the 19th and 20th centuries.
https://randalolson.com/2014/06/23/why-the-dutch-are-so-tall/
"I also don't deny that there might be genetic subracial trait differences within Europeans, it's just that 1) these are not on IQ (because there is no phenotypic difference)"
It's not exactly true there are no IQ differences between Northwestern and Northeastern Europeans.
But there is much more to national development than average IQ, broadly the suite of psychological and behavioral traits that make up WEIRDness (e.g., trust/trustworthiness, openness to experience, imagination, rule-abiding/corruption/nepotism etc).
This:
"there are demonstrated cases when formerly developed regions were semi-permanently stuck behind due to historical forces."
...wasn't a historical accident. There is a reason innovation happened where it did (and continues to happen where it does).
"if Austrians are genetically hardwired to at least twice higher levels of development than Czechs or Hungarians then this the result of truly subtle genetic effects because the genetic differentiation of these populations is much smaller"
Genetic distance ≠ phenotypic distance. The suite of WEIRD traits substantially differs across Northern Europe going from West to East, even if average IQ stays roughly constant.
North Koreans are actually genetically taller than South Koreans. I estimate their genetic potential to be around 177 cm for men, while South Korea probably has a 175.5 cm ceiling.
Can I summarize your argument like this: tl;dr genes give you possible phenotypes, environment gives you actual phenotype?
You give excellent examples showing why the twin-study estimate that height is very strongly heritable has to be wrong: people with the same genes growing up in very different environments, like North and South Korea or the 21st and 19th centuries, are radically different in height. Lyman Stone has also been making this same kind of point: https://open.substack.com/pub/lymanstone/p/more-evidence-twin-studies-are-bad?r=4952v2&utm_campaign=post&utm_medium=web&showWelcomeOnShare=false
I don't think the twin studies are wrong or that height (or IQ or anything else they say) is not strongly heritable! What I am reiterating is that heritability doesn't mean genetic determinism; it just means that genes are the principal causes of differences in one context (although a quite relevant one: within the same society at the same time). This is not a new idea; this has always been the definition of heritability.
What your example shows is that heritability is specific to a population. So if all the twin studies' participants are white Americans born in the 80s, you can find out that the heritability of height is 0.9 for that population, which shares an environment. But that doesn't say anything about how heritable height is in another population or how heritable it would be in a more diverse sample with more diverse environments. That's why twin studies can't distinguish between the effect of genes and environment, but always lump in a lot of GxE with the direct effect of genes, which is Stone's point.
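The population-specific point can be sketched with the textbook variance ratio h2 = Vg / (Vg + Ve). This is a minimal sketch with hypothetical variance numbers chosen only so that the homogeneous case lands on the 0.9 figure mentioned above; it ignores GxE interaction and gene-environment correlation, which is exactly the part this comment argues twin designs cannot separate out.

```python
def heritability(var_genetic, var_environment):
    """Share of phenotypic variance due to genetic variance (no GxE, no rGE)."""
    return var_genetic / (var_genetic + var_environment)

var_g = 9.0  # hypothetical genetic variance, identical in both scenarios

narrow = heritability(var_g, 1.0)  # homogeneous environments
wide = heritability(var_g, 9.0)    # same genes, much more varied environments

print(f"homogeneous environments: h2 = {narrow:.2f}")
print(f"diverse environments:     h2 = {wide:.2f}")
```

The genetic variance is the same in both calls; only the environmental variance changes, and the heritability estimate moves with it.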
A lot of hypotheses about environmental causes - many make it into policy - assume that the environment operates even within the limited within-country, within-era range. For example, sociologists and egalitarian politicians often assume that generational poverty or violence is due to each generation learning bad patterns from their parents, so if we break the cycle once it is gone forever. Standard behavior genetic studies still make this look unlikely, which is a big deal on its own.
My counter-examples are against an even more serious hereditarianism which assumes that genes are always the cause. I think you are correct to be skeptical about this yourself.
I'm not sure that those egalitarian politicians are wrong. American twin studies are conducted on overwhelmingly white, non-abusive, middle-income or better households, so we don't know how growing up poor and black in an abusive household affects traits. I don't think we do at least, but there may be work on that question, which it would be good if someone would write up.
Actually we do have full-population studies of IQ with no volunteer bias and the results are the same:
- Scottish Mental Survey (every kid had to participate, also old so everybody was poor by modern standards): https://link.springer.com/article/10.1007/s10519-005-3556-x
- Twins linked from Swedish mandatory conscription: https://www.sciencedirect.com/science/article/pii/S0160289614001676?via%3Dihub
- British and Dutch competency surveys every kid in school took: https://link.springer.com/article/10.1007/s10519-012-9549-7
- Heritability of diseases from insurance data: https://www.nature.com/articles/s41588-018-0313-7#article-info
Note also what I wrote in the piece about the lack of a well-replicating Scarr-Rowe effect (growing up poor doesn't diminish genetic effects), and specifically the Pesta racial meta-analysis which shows that heritability is no lower in Blacks.
The question isn't whether heritability is lower or higher among blacks than whites, but whether heritability estimates would decrease if twins reared apart were exposed to a wider range of environments. Those national samples don't tell us anything about the question, I don't believe, because the individual pairs of twins would have shared overwhelmingly the same environments as well as genes.
I found this paper which I think addresses the real problems of separating environmental from genetic effects. It tries to estimate the environmental effect of growing up in different countries on well-being by using international metrics of well-being and integrating them with what we know about twins. Of course, the same thing could be done with IQ, if you accepted that all or most of international variation in IQ was due to environment. You would certainly find that environment plays a vastly greater role in IQ than twin studies have been able to measure. https://journals.sagepub.com/doi/pdf/10.1177/17456916231178716