GENOME-WIDE SELECTION SIGNATURES IN PINZGAU CATTLE

The aim of this study was to identify evidence of recent selection based on the integrated Haplotype Score (iHS) and the population differentiation index (FST), and to characterize affected regions near QTL associated with traits under strong selection in Pinzgau cattle. In total, 21 Austrian and 19 Slovak purebred bulls genotyped with the Illumina BovineHD and BovineSNP50 BeadChips were used to identify genomic regions under selection. Only autosomal loci with a call rate higher than 90%, a minor allele frequency higher than 0.01 and a Hardy-Weinberg equilibrium limit of 0.001 were included in the subsequent analyses of selective sweeps. The final dataset consisted of 30538 SNPs with an average spacing of 81.86 kb between adjacent SNPs. The iHS scores were averaged into non-overlapping 500 kb segments across the genome. The FST values were also plotted against genome position using a sliding window approach and averaged over 8 consecutive SNPs. Based on the iHS evaluation, only 7 regions with an iHS score higher than 1.7 were found. The average iHS scores observed for adjacent syntenic regions indicated a slight effect of recent selection in the analysed group of Pinzgau bulls. The level of genetic differentiation between Austrian and Slovak bulls estimated from the FST index was low. Only 24% of the FST values calculated for individual SNPs were greater than 0.01. With the sliding window approach, 5% of the analysed windows had a value higher than 0.01. Our results indicate the use of similar selection schemes in the breeding programs of Slovak and Austrian Pinzgau bulls. The evidence for a genome-wide association between signatures of selection and regions affecting complex traits such as milk production was insignificant, because the loci in segments identified as affected by selection were very distant from each other.
Identifying genomic regions that may be under selection pressure for phenotypic traits, in order to better understand the relationship between genotype and phenotype, is one of the challenges of livestock genetics.


INTRODUCTION
Genome-wide screening of single nucleotide polymorphisms (SNPs) can improve the understanding of the connection between genotype and the phenotypic changes resulting from the formation of modern livestock breeds. The analysis of a large number of SNPs across the genome can reveal aspects of the population genetic structure, including evidence of adaptive selection (Barendse et al., 2009). Variation identified within the genome of cattle breeds has been caused primarily by human selection during domestication and subsequent breed formation. Domestication greatly changed the morphological and behavioral characteristics of cattle, and breed formation together with selection programmes for improving production traits allowed the formation of very diverse breeds (De Simoni Gouveia et al., 2014).
The identification and explanation of selection signatures can provide basic knowledge about the evolutionary changes that shaped the genome, and is also promising for identifying domestication-related loci that may ultimately help the further genetic improvement of economically important traits (De Simoni Gouveia et al., 2014; Qanbari et al., 2014). Much of the variation across the genetically diverse ancestral population was either lost due to the limited numbers of animals within the areas of domestication or was divided among the subpopulations that were later recognized as distinct breeds. The strong selection to fix favourable mutations underlying domestication and the formation of each breed created selective sweeps in which variation was also lost (Ramey et al., 2013). If a mutation is recent and selection is strong, alleles under positive selection increase in frequency, producing a selective sweep or selection signature. A neutral mutation, by contrast, takes many generations to reach a high population frequency through drift. Where selection on a locus is weak or the mutation is old, little evidence of this selection may be left in the genome (Qanbari et al., 2010a; Kemper et al., 2014).
Genes underlying phenotypic variation can be evaluated using two approaches. The first, from phenotype to genome, is carried out by linkage disequilibrium based association mapping and may involve positional cloning of QTL or targeting particular candidate genes identified on the basis of homology to known genes. The second, from genome to phenotype, involves the statistical evaluation of genomic data to identify likely targets of past selection. The elimination of standing variation in regions linked to a recently fixed
The aim of our study was to identify signatures of strong and recent selection based on the integrated Haplotype Score and the population differentiation index, and to characterise genomic regions that have been subjected to selective sweeps.

MATERIAL AND METHODOLOGY
In this study three data sets of animals were used to detect signatures of recent selection in Pinzgau cattle. In total, 21 Austrian and 19 Slovak purebred bulls registered by their breed associations were genotyped using the Illumina BovineHD and Illumina BovineSNP50 BeadChips. The dataset of 21 Austrian sires was assembled so that the animals shared common ancestors with, or were related to, the Slovak ones. A detailed description of sample size and data source for each set can be found in Table 1. The 36393 SNPs common to both Illumina genotyping arrays were retained in a reduced SNP panel. Markers assigned to unmapped regions or with unknown chromosomal position according to the latest bovine genome assembly (Btau 4.0) and SNPs located on sex chromosomes were removed (634). Quality control of the data was carried out according to Purcell et al. (2007).
Autosomal loci with a call rate <90%, a minor allele frequency <0.01 or a significant deviation from Hardy-Weinberg equilibrium (limit of 0.001) were excluded from subsequent analyses (5221). Evidence of positive selection was evaluated using two approaches: the integrated Haplotype Score (iHS) statistic and Wright's fixation index (FST).
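The three filters described above can be illustrated with a minimal sketch. It is not the software used in the study: the function name `snp_passes_qc`, the 0/1/2 genotype coding with `None` for missing calls, and the use of a simple 1-df chi-square test for Hardy-Weinberg equilibrium (instead of the exact test implemented in dedicated tools) are all assumptions made for illustration.

```python
import math


def snp_passes_qc(genotypes, min_call_rate=0.90, min_maf=0.01, hwe_alpha=0.001):
    """Return True if one SNP passes the call rate, MAF and HWE filters.

    genotypes: list with one entry per animal, coded as the number of
    copies of allele B (0, 1 or 2); None marks a missing call.
    (Hypothetical coding, chosen only for this sketch.)
    """
    called = [g for g in genotypes if g is not None]
    if not genotypes or len(called) / len(genotypes) < min_call_rate:
        return False

    n = len(called)
    p = sum(called) / (2 * n)          # frequency of allele B
    q = 1 - p
    if min(p, q) < min_maf:            # minor allele frequency filter
        return False

    # Chi-square goodness-of-fit test against Hardy-Weinberg proportions.
    # Assumption: a plain 1-df chi-square test, not an exact test.
    expected = [n * q * q, 2 * n * p * q, n * p * p]
    observed = [called.count(0), called.count(1), called.count(2)]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    p_value = math.erfc(math.sqrt(chi2 / 2))   # survival function, 1 df
    return p_value >= hwe_alpha
```

Applied to every autosomal SNP, a filter of this kind would leave only the loci that enter the iHS and FST scans. With 40 animals the chi-square approximation is rough, which is one reason an exact test is usually preferred in practice.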
The analysis of selective sweeps using the iHS statistic is based on haplotype frequencies, as specified by Voight et al. (2006). The haplotypes were reconstructed for each autosome using default parameters according to Scheet and Stephens (2006). The iHS statistic evaluates the extent of local linkage disequilibrium, partitioned between haplotypes that carry the ancestral allele and haplotypes that carry the derived allele at a given locus. The iHS score reflects the haplotype structure and essentially indicates unusually long haplotypes carrying the ancestral or the derived allele (Qanbari et al., 2011). The set of ancestral alleles reported by Matukumalli et al. (2009) was used in our study. In the iHS statistic each locus is treated as a core SNP, and the test begins with the calculation of the extended Haplotype Homozygosity (EHH) for each core SNP. Because SNPs are biallelic loci, each core SNP allele can be either ancestral or derived. The integrated EHH (iHH), summed over both directions away from the core SNP, is denoted iHH_A or iHH_D, depending on whether it is computed for the ancestral or the derived core allele. With the score defined on the ratio of iHH_A to iHH_D, positive iHS values indicate greater extended homozygosity around the ancestral allele and negative values indicate greater extended homozygosity around the derived allele. The iHS within the analysed population was evaluated using the rehh package for R (Gautier and Vitalis, 2012). Subsequently, the iHS values were averaged in genome-wide non-overlapping 500 kb windows.
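The window-averaging step can be sketched as follows. This is only an illustration under stated assumptions: per-SNP iHS scores are taken as already computed (for example by rehh), the function names `window_means` and `flag_windows` are hypothetical, and the 1.7 cut-off is applied to the signed window mean because the text does not state whether absolute values were used.

```python
from collections import defaultdict

WINDOW_BP = 500_000  # non-overlapping 500 kb windows


def window_means(records, window_bp=WINDOW_BP):
    """Average per-SNP scores within non-overlapping windows.

    records: iterable of (chromosome, position_bp, score) tuples.
    Returns {(chromosome, window_index): mean_score}.
    """
    sums = defaultdict(lambda: [0.0, 0])   # key -> [running sum, SNP count]
    for chrom, pos, score in records:
        key = (chrom, pos // window_bp)
        sums[key][0] += score
        sums[key][1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


def flag_windows(means, threshold=1.7):
    """Return the sorted window keys whose mean score exceeds the threshold."""
    return sorted(key for key, mean in means.items() if mean > threshold)


# Made-up scores, placed near positions mentioned in the text.
example = [
    (15, 52_030_000, 2.0),
    (15, 52_470_000, 1.8),
    (15, 52_600_000, 0.1),
    (7, 100, 2.24),
]
flagged = flag_windows(window_means(example))
```

A window index multiplied by `window_bp` gives the window start coordinate, so flagged windows can be mapped back to genome positions and compared with QTL or gene annotations.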

RESULTS
The dataset of 30538 autosomal SNPs that passed the filtering criteria was used to identify genomic regions in Pinzgau bulls that may be influenced by recent selection. This subset of loci covered 25084.85 Mbp of the genome with an average spacing of 81.86 kb between adjacent SNPs. The distribution of minor allele frequency (MAF) across the panel of loci was not uniform (Figure 1).
Two approaches were used to look for evidence of recent selection. Firstly, the iHS statistic was applied to the dataset to detect selective sweeps. The iHS score was calculated for each SNP and then averaged into non-overlapping 500 kb segments across the genome. The window size was chosen to ensure a sufficient number of SNPs in each segment. Genomic regions were considered recently selected when the iHS score of multiple loci located within 0.5 Mb was greater than 1.7. In total, 7 such regions were detected (Table 2).
Secondly, to estimate the genome-wide pattern of positive selection within the evaluated population of Pinzgau bulls, the FST index was calculated for each SNP. Genomic differentiation was evaluated between the two groups defined by origin, because differences in origin can lead to increased allele frequencies at loci potentially affected by positive selection. Higher allele frequencies at these loci may reflect differences in balancing or directional selection, neutrality or other processes operating in the breeding programs of the Austrian and Slovak Pinzgau cattle populations.
Theoretically, FST values vary from 0 to 1, where the extremes mean total identity (FST = 0) or complete differentiation (FST = 1) between the analysed populations. Selection signatures can be recognized when adjacent loci all show high FST due to the hitch-hiking effect resulting from divergent selection, or when adjacent SNPs all show low FST resulting from balancing selection between populations (Qanbari et al., 2011). In our study the autosomal regions were recognized as affected by positive selection when adjacent SNPs showed FST values higher than 0.20. As indicated in Figure 3, only a few loci tended to cluster in the same region. Slight signals of positive selection were found in 6 regions located on chromosomes 1, 2, 4, 5, 10 and 28 (Table 3).
Figure 2 Genome-wide plot of the iHS score averaged for 500 kb.
In the next step of the FST estimation, the values were averaged over windows of 8 SNPs within each autosome to determine the global pattern of FST across the genome. More than 95% of the windows across all autosomes showed FST values lower than 0.01. The results indicated a unimodal distribution and a largely uniform pattern of selection across all loci included in the analysis. The low level of genetic differentiation between Austrian and Slovak Pinzgau bulls detected with the FST index is consistent with the similar origin and common ancestors found in the pedigree data of the analysed bulls.

DISCUSSION
The level of genetic variation among cattle populations is the result of neutral demographic processes, weak but sustained natural selection and strong short-term artificial selection for divergent breeding goals (Qanbari et al., 2011). In our study we used genome-wide SNP data to detect evidence of positive selection based on iHS and FST scans in Pinzgau bulls originating from Austria and Slovakia. The results of both analyses indicated a low level of genomic differentiation between the analysed groups. Based on the iHS statistic, seven regions potentially affected by recent selection were detected. The total number of SNPs in each region was low. Only one of the identified segments was found in a genome region associated in previous studies with a bovine quantitative trait locus (Table 2). The FST values averaged in 8-SNP windows indicated that the selection programs of Slovak and Austrian bulls are similar and that the genomic regions therefore showed only very small differences.
Both of the applied statistics have been used successfully in the evaluation of selection signatures in different cattle populations.
The genome-wide scan for selection signatures in livestock is one of the many approaches available for the estimation of genomic diversity thanks to the development of high-throughput SNP genotyping arrays. Large datasets with high SNP density provide much better insight into the biological processes underlying natural and artificial selection of animals. The results of such studies can provide valuable data for increasing the efficiency of animal selection strategies and may also help in understanding the biological limits and signals resulting from the high selection pressure applied to achieve breeding goals.

CONCLUSION
The genome-wide scan for positive selection signatures using the iHS and FST statistics led to the detection of a few regions affected by recent selection in Pinzgau cattle. Our results indicate the use of similar selection programs in the Slovak and Austrian Pinzgau populations. The conditions that would result in clear evidence of selection signatures were rare. The response to selection resulted from small allele frequency changes at many loci that were already polymorphic before selection started in the population. The results show a low level of genetic differentiation, or high genetic relatedness, between the analysed Austrian and Slovak Pinzgau bulls, which is due to the fact that bulls from both populations had common ancestors. Pinzgau cattle are recognized as a producer of food of specific quality owing to their mountain origin. The observed results confirm the previously stated assumption of a common gene pool of the Slovak and Austrian populations and indicate the importance of both populations for the preservation and utilization of Pinzgau cattle.
(Qanbari et al., 2011; De Simoni Gouveia et al., 2014).
According to Voight et al. (2006), the iHS score is a within-population score based on the ratio between iHH_A and iHH_D:

$$\mathrm{iHS} = \frac{\ln\left(\frac{iHH_A}{iHH_D}\right) - E_p\left[\ln\left(\frac{iHH_A}{iHH_D}\right)\right]}{SD_p\left[\ln\left(\frac{iHH_A}{iHH_D}\right)\right]}$$

where the expectation $E_p$ and standard deviation $SD_p$ are estimated from SNPs with the same derived allele frequency $p$ as the core SNP.
The genome-wide pattern of selection signatures was also estimated by calculating the basic form of Wright's FST fixation index corrected by Weir and Cockerham (1984) at each syntenic locus and visualised using SNP & Variation Suite v8.x (Golden Helix, Inc., Bozeman, MT, www.goldenhelix.com). The FST index, which describes the degree of genetic differentiation between subpopulations, can theoretically range from 0 to 1, but estimates can also take negative values (Akey et al., 2002). Selection signatures can be recognized when adjacent SNPs all show high FST (Weir et al., 2005) due to the hitch-hiking effect (Maynard-Smith et al., 1974), implying divergent selection between breeds, or when adjacent SNPs all show low FST, implying balancing selection between breeds. Smoothing, in which a moving average over a certain number of markers is taken, is a method of looking for regions where selection is apparent over multiple markers rather than in one-off high values (Barendse et al., 2009; Moradi et al., 2012). The FST values were also evaluated against genome position using a sliding window approach and averaged over 8 consecutive SNPs.
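The per-SNP differentiation and the 8-SNP smoothing can be sketched as below. This is an illustration under explicit assumptions, not the procedure implemented in the software used in the study: `wright_fst` uses the basic heterozygosity form FST = (H_T - H_S) / H_T for two equally weighted subpopulations and is not the Weir and Cockerham (1984) estimator (unlike that estimator, it cannot return negative values), and both function names are hypothetical.

```python
def wright_fst(p1, p2):
    """Basic Wright's FST for one biallelic SNP and two subpopulations.

    p1, p2: frequency of the same allele in each subpopulation.
    Assumption: equal weighting of both groups, no sample-size correction.
    """
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)                        # total expected heterozygosity
    if h_t == 0:                                         # SNP fixed in both groups
        return 0.0
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2    # mean within-group heterozygosity
    return (h_t - h_s) / h_t


def sliding_mean(values, width=8):
    """Moving average over `width` consecutive SNPs, advancing one SNP at a time."""
    if len(values) < width:
        return []
    window_sum = sum(values[:width])
    means = [window_sum / width]
    for i in range(width, len(values)):
        window_sum += values[i] - values[i - width]      # slide the window by one SNP
        means.append(window_sum / width)
    return means
```

The smoothing would be applied separately to the ordered SNPs of each autosome, so that no window spans two chromosomes. Runs of high smoothed values then point to regions where differentiation is supported by several adjacent markers.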

Figure 1 Distribution of MAF across genome.
Figure 2 displays the genome-wide plot of iHS values against genomic position. The observed segments were located close to different genes. The average iHS score was 0.05, and the highest score (2.24) was identified for a region on chromosome 7 with only one observed locus. The gene encoding DDB1 and CUL4 associated factor 15 is present at this genome location. Most of the SNPs that showed significant iHS values were located on chromosome 15 in the genomic region from 52.03 to 52.47 Mb. This bovine autosomal region contains the nuclear mitotic apparatus protein 1 (NUMA1) gene, which is conserved across different species including human. The NUMA1 gene has been tested for evidence of its role in proliferative activity and meiotic cell division (Taimen et al., 2004). The region on chromosome 18, consisting of only 2 loci, was near the fat mass and obesity-associated (FTO) gene, which was significantly associated with carcass traits and meat quality in cattle and pigs (Zhang et al., 2011; Dvořáková et al., 2012). However, the distribution of segments with clustered loci across the autosomes was non-uniform. The iHS scores across autosomal segments indicated that the regions potentially affected by recent selection showed no major overlap. The weak signals of recent selection may be caused by the small sample size or, mainly, by the fact that the analysed individuals were genetically related.
populations. Qanbari et al. (2010b) found, in a population of German Holstein-Friesian cattle, a segment with an outlier value on chromosome 18 that contains the sialic acid binding Ig-like lectin 5 gene and the zinc finger protein 577 gene. These genes are considered candidates for calving ease, longevity and total merit index in Holstein cattle (Cole et al., 2009). Gu et al. (2009) reported positive selection signatures in the genomic region surrounding muscle-related genes. Evidence of strong selection in the region near the growth hormone gene located on chromosome 20 was found by Flori et al. (2009) in Angus and by Hayes et al. (2009) in Holstein breeds. Simianer et al. (2010) reported outlier FST windows for two regions on chromosomes 2 and 5 in the vicinity of the ZRANB3, R3HDM1 and WIF1 genes, which are known to affect feed efficiency and mammalian mesoderm segregation. Studies of three French dairy cattle breeds published by Van Tassell et al. (2008) and Karim et al. (2011) found differentiation in the region on chromosome 18 associated with coat color (MC1R gene) and in the segment on chromosome 14 harbouring the PLAG1 gene, which is important for cattle growth. The platelet-derived growth factor alpha polypeptide (PDGFA) was identified as a potential candidate gene underlying the selective sweep on BTA25 in Simmental cattle, and the receptor for this growth factor (PDGFRA) was identified as differentiated among the French dairy breeds (Ramey et al., 2013).

Figure 3 The distribution of FST values within the autosomes.

Table 1
Description of used sample and genotyping array.

Table 2
Detected autosome segments identified as regions under selection.
Table 2 summarizes the autosomal regions displaying significant iHS values. Using the iHS statistic, only a few SNPs were found that can be regarded as loci under selection. Across the genome, only 7 windows had an iHS value greater than 1.7.

Table 3
Summary of autosome regions identified as affected by positive selection.
higher than 0.20. The observed FST values ranged from -0.05 to 0.28, with an average value of 0.0005. In total, 76.23% of FST values were lower than 0.01. The highest average FST was found for BTA 4 (0.004). As indicated in figure