Next: Correction of the Length-Year Up: Relating the IBD Length Previous: Exponential Distributed IBD Lengths   Contents


Correction for the Assumptions of IBD Length Distributions

The theoretical length distribution was derived for IBD sharing between two individuals, whereas we consider IBD sharing among many individuals. The raw length of an IBD segment was computed as the maximal length shared by any two individuals that possess the segment. Because this is the maximum over all pairwise sharings, the raw lengths are overestimated. We corrected the IBD lengths as described below.

The raw IBD segment lengths are computed as the maximal IBD sharing of any two individuals that possess the IBD segment. The exponential distribution of IBD lengths explained in Subsection 4.1, however, was derived for pairs of haplotypes (17,16,19,18). Our raw IBD segment lengths are therefore longer than the IBD lengths for pairs of haplotypes, and the assumption of exponentially distributed IBD segment lengths can only be applied after the raw lengths have been corrected. We observed a second cause for raw IBD segments being longer than expected under the exponential distribution: the more individuals share an IBD segment, the more likely it is that two of them share random minor alleles, which would falsely extend the IBD segment.
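The size of the first effect can be illustrated with a simplified calculation. The symbols below are assumptions introduced only for this illustration: L_1, ..., L_m denote the m pairwise sharing lengths of one IBD segment, treated as independent and exponentially distributed with rate \lambda. Real pairwise sharings that involve the same individual are not independent, so this is a rough sketch of the direction of the bias rather than the model used in the text.

```latex
% Assumption: L_1, ..., L_m are independent, each exponential with rate \lambda
\mathbb{E}\Bigl[\max_{1 \le i \le m} L_i\Bigr]
  = \frac{1}{\lambda} \sum_{k=1}^{m} \frac{1}{k}
  \;\ge\; \frac{1}{\lambda}
  = \mathbb{E}[L_1]
```

Under these assumptions, 4 individuals give m = 6 pairs, and the expected maximum is about 2.45 times the expected length of a single pairwise sharing.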

Therefore, we correct the raw length of an IBD segment by locating the first tagSNV from the left (upstream) that is shared by 3/4 of the individuals that possess the IBD segment. This tagSNV is the left break point of the IBD segment. Analogously, the right break point is the first tagSNV from the right (downstream) that is shared by 3/4 of the individuals. The distance between these two break points is the (corrected) length of the IBD segment.
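The break point rule above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used for the reported results. The function name `corrected_ibd_length`, its parameters, and the example data are hypothetical. It assumes that the tagSNV positions are sorted in ascending order, that the number of carriers is known for each tagSNV, and that "shared by 3/4" means "shared by at least 3/4".

```python
from typing import Optional, Sequence, Tuple


def corrected_ibd_length(
    positions: Sequence[int],
    carriers: Sequence[int],
    n_individuals: int,
    min_fraction: float = 0.75,
) -> Optional[Tuple[int, int, int]]:
    """Return (left_break, right_break, corrected_length) for one IBD segment.

    positions     -- tagSNV positions, assumed sorted ascending (hypothetical input)
    carriers      -- number of individuals carrying each tagSNV
    n_individuals -- number of individuals that possess the IBD segment
    min_fraction  -- assumed reading of "3/4": at least this fraction must share

    Returns None if no tagSNV reaches the required fraction.
    """
    threshold = min_fraction * n_individuals

    # Keep only tagSNVs that are shared by enough individuals.
    qualifying = [pos for pos, c in zip(positions, carriers) if c >= threshold]
    if not qualifying:
        return None

    left_break = qualifying[0]    # first qualifying tagSNV from the left
    right_break = qualifying[-1]  # first qualifying tagSNV from the right
    return left_break, right_break, right_break - left_break


# Made-up example: 8 individuals possess the segment, so 6 carriers are required.
example_positions = [100, 250, 400, 900, 1500, 1700]
example_carriers = [2, 5, 6, 8, 7, 3]
print(corrected_ibd_length(example_positions, example_carriers, 8))
```

In this made-up example the raw segment spans positions 100 to 1700, while the corrected segment runs from 400 to 1500, because the outermost tagSNVs are carried by too few individuals.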


Sepp Hochreiter 2013-11-13