DNA Analysis

Specific tasks (DNA Analysis at NYU)

Isolate genomic DNA from tissues from each of 1000-1200 fin clips.
Identify at least 100 microsatellite loci, with their flanking primer sequences, by 454 sequencing of two monkfish DNA libraries, each enriched for a different microsatellite motif.
Test a subset of 25 microsatellites for their reproducibility in PCR amplification.
Evaluate the extent of allelic diversity at 25 reproducibly amplifying microsatellites, using 5 specimens from each of 5 geographically or temporally distinct collection sites.
Characterize microsatellite variation in each of 1000-1200 specimens at 12 diagnostic loci.
Statistically analyze DNA data for stock structure and migration rates.
Prepare final report and manuscript and present results at appropriate management forums.

Specific Methods

DNA Isolations
Fin clips will be the primary source of DNA for this study. Secondarily, to analyze the short-term temporal stability of allelic frequencies, we will analyze vertebrae from frozen collections from the NOAA NEFSC fall and spring trawl surveys. Total DNA will be isolated from EtOH-preserved tissue or frozen vertebrae by incubation in CTAB buffer (Saghai-Maroof et al. 1984) and digestion with proteinase K, followed by standard phenol-chloroform extractions and alcohol precipitations. DNA concentrations and purities will be determined with a Nanodrop ND-1000 spectrophotometer. All DNA samples will be diluted to a final concentration of 50 ng/μl for standardization in subsequent microsatellite PCR reactions.
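The dilution to 50 ng/μl follows the usual C1 x V1 = C2 x V2 relationship. As a minimal sketch, the volumes for a given Nanodrop reading could be computed as shown here. The function name and the 100 μl final volume are illustrative assumptions and are not part of the protocol.

```python
def dilution_volumes(stock_ng_per_ul, target_ng_per_ul=50.0, final_volume_ul=100.0):
    """Return (stock_ul, diluent_ul) needed to reach the target concentration.

    Uses C1 * V1 = C2 * V2. The 100 ul default final volume is an assumed,
    illustrative value; only the 50 ng/ul target comes from the text.
    """
    if stock_ng_per_ul < target_ng_per_ul:
        raise ValueError("stock is already below the target concentration")
    stock_ul = target_ng_per_ul * final_volume_ul / stock_ng_per_ul
    diluent_ul = final_volume_ul - stock_ul
    return stock_ul, diluent_ul


if __name__ == "__main__":
    # Hypothetical Nanodrop reading of 200 ng/ul.
    print(dilution_volumes(200.0))
```

For a stock measured at 200 ng/μl, this gives 25 μl of stock plus 75 μl of diluent.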

Microsatellite analyses
Microsatellite analysis will be conducted at 12 Lophius americanus loci that will be isolated specifically for this study by a commercial vendor, Ecogenics (see accompanying quote), which specializes in developing genomic DNA libraries from which microsatellites are identified using 454 next-generation sequencing technology. We are aware that 8 polymorphic microsatellite loci have previously been isolated from European black anglerfish (Blanco et al. 2006), but we feel that it is worth the extra expenditure to isolate microsatellites specifically for our target species. In our experience, microsatellite primers from closely related species sometimes amplify DNA from a target species reliably, but that is not always the case, and the loci that are highly conserved are often not the ones that exhibit the highest levels of allelic variation. Ecogenics promises a 2-month turnaround time on the identification of the requisite number of microsatellites. Ecogenics will develop 2 distinct libraries, each highly enriched for a different microsatellite motif: one a 2-base GT repeat and the other a 3-base repeat. Ecogenics will screen the sequencing data for microsatellite repeats and identify the sequences, amplicon lengths, and flanking single-copy sequences of these repeats. Ecogenics will then design primers from these sequences that will allow for easy multiplexing of loci in genotyping reactions. The number of microsatellite sequences identified by Ecogenics varies greatly among species, but on average about 300 loci suitable for primer design are identified.
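The vendor's screening pipeline is not described here. Purely as an illustration of what screening reads for 2-base and 3-base repeats involves, the following sketch finds perfect tandem repeats with a regular expression. The function name, the minimum repeat counts, and the example read are all hypothetical.

```python
import re


def find_repeats(seq, motif_len, min_repeats):
    """Return (start, motif, n_repeats) for each perfect tandem repeat.

    Illustrative only: real pipelines also handle imperfect repeats and
    check that enough flanking sequence exists for primer design.
    """
    # One motif of length motif_len, then at least (min_repeats - 1) copies.
    pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_repeats - 1))
    hits = []
    for m in pattern.finditer(seq.upper()):
        motif = m.group(1)
        if len(set(motif)) == 1:
            continue  # skip homopolymer runs such as AAAAAA
        hits.append((m.start(), motif, len(m.group(0)) // motif_len))
    return hits


if __name__ == "__main__":
    # Hypothetical read: flank + (GT)8 + flank.
    read = "ACCTGA" + "GT" * 8 + "TTCAGG"
    print(find_repeats(read, motif_len=2, min_repeats=6))
```

On the hypothetical read above, the function reports a single GT repeat of 8 units starting at position 6 (0-based).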

At NYU, we will first empirically test a subset of 25 microsatellites identified by Ecogenics for their reproducibility in PCR amplification and levels of genetic variation in monkfish. This initial test of genetic variation will identify a subset of loci that provide sufficient polymorphism to robustly test for population structure across our complete sample set. We will test these loci in 25 specimens, 5 from each of 5 geographically distant collection sites. If needed, we will test more microsatellites for their suitability in subsequent population analysis. Our goal is to identify 12 informative loci to use in our complete population study.

PCRs will be run in 20 μl volumes containing 50 ng of template DNA, 1× PCR buffer, 2 mM MgCl2, 0.25 mM dNTPs, 0.05 mM forward and reverse primers (one of which will be fluorescently labeled), and 0.1 U Taq DNA polymerase. PCRs will be either multiplexed or run as single reactions and pooled prior to analysis. Microsatellite genotypes will be characterized at no charge to this project on a Beckman Coulter CEQ™ 8000 capillary-based DNA sequencer housed across the hall from Wirgin's lab in the NYU NIEHS Molecular Facilities Core, of which Wirgin is Co-Director. Multiplexed PCR reactions will be diluted up to 1:3 with Sample Loading Solution; 0.5 to 2 μl of the diluted reactions will then be loaded onto 96-well plates along with 0.5 μl of CEQ DNA Size Standard-400 and 40 μl of Sample Loading Solution and run with the FRAG 1 program. MICRO-CHECKER (Van Oosterhout et al. 2004) will be used to test for the presence of null alleles, errors due to microsatellite stuttering, and large-allele dropout.

Statistical Analyses of Microsatellite Data

Multi-locus microsatellite nDNA genotypes will be compiled for each specimen. Measures of diversity, including mean number of alleles, allelic richness, FIS, and observed and expected heterozygosities (HO and HE), will be calculated using GDA (Lewis and Zaykin 2001) for each locus and population sampled. Deviations from Hardy-Weinberg proportions and linkage equilibrium will be tested with GENEPOP v4.0.6 (Rousset 2007) using the Markov chain method with 10,000 iterations and 10,000 batches (Raymond and Rousset 1995). Allelic frequency data will be examined for heterozygote deficit using the score (U) test developed for multiple sample analysis by Rousset and Raymond (1995). Locus-specific FST values and Fisher's exact tests of allelic differentiation will be calculated in FSTAT version 2.9.3 (Goudet et al. 1995; Goudet 2001) and GENEPOP. Pairwise FST and RST comparisons will be done at single loci and across all loci using the FST estimator θ of Weir and Cockerham (1984). Bonferroni adjustments will be applied to the P values generated from all component tests.
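To make the per-locus diversity measures concrete, the sketch below computes the number of alleles, HO, HE, and a simple FIS for one locus in one collection. It is an illustration under stated assumptions, not the estimator GDA reports: HE uses Nei's unbiased formula, FIS is taken as 1 - HO/HE, and the genotypes (allele sizes in base pairs) are invented.

```python
from collections import Counter


def locus_diversity(genotypes):
    """Summarize one locus from a list of diploid genotypes (allele_a, allele_b).

    He is Nei's unbiased expected heterozygosity, 2n/(2n-1) * (1 - sum(p^2)).
    Fis is the simple ratio 1 - Ho/He, used here only for illustration.
    """
    n = len(genotypes)
    total = 2 * n  # number of gene copies sampled
    counts = Counter(allele for g in genotypes for allele in g)
    h_obs = sum(1 for a, b in genotypes if a != b) / n
    sum_p2 = sum((c / total) ** 2 for c in counts.values())
    h_exp = (total / (total - 1)) * (1.0 - sum_p2)
    f_is = 1.0 - h_obs / h_exp if h_exp > 0 else float("nan")
    return {"n_alleles": len(counts), "Ho": h_obs, "He": h_exp, "Fis": f_is}


if __name__ == "__main__":
    # Hypothetical genotypes at one locus (allele sizes in bp).
    sample = [(150, 150), (150, 154), (154, 154), (150, 154)]
    print(locus_diversity(sample))
```

For the four invented genotypes, HO is 0.5, HE is 4/7 (about 0.571), and FIS is 0.125; a positive value indicates fewer heterozygotes than expected.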

The statistical power and realized α-error for assessing the null hypothesis of genetic homogeneity within and across sample collections will be evaluated using POWSIM (Ryman and Palm 2006).

FST values for the pooled dataset will also be used in a principal component analysis (PCA) using GENALEX 6.1 (Peakall and Smouse 2006) to visualize the clustering of populations. A posteriori analysis of population clusters identified by FST and PCA will be conducted using a hierarchical AMOVA implemented in ARLEQUIN 2.0 (Schneider et al. 2000), with 10,000 permutations, to compare genetic variation within and among clusters.

We will also analyze the data on an individual basis, without a priori designation of populations, as an exploration of population structure using two programs: STRUCTURE v.2.3 (Pritchard et al. 2000; Falush et al. 2005; Pritchard et al. 2010) and BAPS v5.1 (Corander et al. 2006). These Bayesian clustering approaches define population units by iteratively sorting individual genotypes into clusters to maximize the fit of the data to theoretical expectations derived from Hardy-Weinberg and linkage equilibrium. Their use will enable us to infer the number of genetically homogeneous clusters within our samples and to assign individuals to designated genetic clusters. For STRUCTURE, we will use the admixture model and correlated allelic frequencies among collections. Both the plateau of likelihood values and ΔK (Evanno et al. 2005) will be estimated. For BAPS, the mixture model will be applied to cluster groups of individuals based on their multilocus genotypes and to determine the optimal number of genetically homogeneous groups. Admixture analysis will then be used to estimate individual admixture proportions with regard to the most likely number of clusters identified. Combined with the results of these assignment tests, rates of dispersal will be estimated by identifying migrant individuals among the populations so defined. We will also use the coalescence-based MIGRATE program to estimate gene flow among genetically differentiated collections or individual-based clusters (Beerli 2008).
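As a rough illustration of the ΔK statistic, the sketch below takes ln P(D) values from replicate STRUCTURE runs at each K and divides the absolute second difference of the mean ln P(D) by the standard deviation across replicates at that K. This is one common way of computing ΔK, stated here as an assumption; the function name and the likelihood values are invented for the example.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Compute Delta K for each interior K.

    lnp_by_k maps K -> list of ln P(D) values from replicate runs
    (at least two replicates per K). Delta K is undefined for the
    smallest and largest K tested, so those are omitted.
    """
    means = {k: mean(v) for k, v in lnp_by_k.items()}
    result = {}
    for k in sorted(lnp_by_k):
        if (k - 1) not in means or (k + 1) not in means:
            continue  # needs both neighbours for the second difference
        second_diff = abs(means[k + 1] - 2.0 * means[k] + means[k - 1])
        sd = stdev(lnp_by_k[k])
        result[k] = second_diff / sd if sd > 0 else float("inf")
    return result


if __name__ == "__main__":
    # Hypothetical ln P(D) values, three replicate runs per K.
    runs = {
        1: [-5000.0, -5002.0, -4998.0],
        2: [-4800.0, -4802.0, -4798.0],
        3: [-4790.0, -4795.0, -4785.0],
        4: [-4785.0, -4795.0, -4775.0],
    }
    dk = delta_k(runs)
    print(dk, "best K:", max(dk, key=dk.get))
```

With these invented values, ΔK is 95 at K = 2 and 1 at K = 3, so K = 2 would be the favored number of clusters.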

To evaluate an isolation by distance (IBD) model, we will use Mantel's tests implemented in GENALEX to test for a correlation between genetic distance (FST) and shortest ocean distance between sample locations. IBD will be evaluated across all spawning collections and within the identified clusters. Genetic barriers will be identified using the program BARRIER v2.2 (Manni et al. 2004). This approach uses the Monmonier algorithm to identify geographic areas associated with genetic discontinuities among spatially connected populations. We will create BARRIER maps from the geographic coordinates of the adult sample locations provided by fishermen and government agency surveys. We will conduct the analyses using two genetic distances: (1) Nei's standard genetic distance D (Nei 1987; calculated in the program MSA; Dieringer and Schlötterer 2003), bootstrapped over 100 matrices, and (2) FST, with 12 matrices, one for each locus. Combining the two approaches will enable us to evaluate the strength of the barriers and the contributions of the different markers.
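The logic of a Mantel test can be sketched in a few lines: correlate the off-diagonal entries of the genetic and geographic distance matrices, then permute the sample labels of one matrix to build a null distribution. The version below is a simplified, one-sided permutation test written for illustration; it is not GENALEX's implementation, and the site positions and FST values are invented.

```python
import random
from math import sqrt


def _pearson(x, y):
    """Pearson correlation of two equal-length lists (assumes nonzero variance)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mantel_test(gen, geo, permutations=999, seed=0):
    """Return (r_observed, one-sided p) for two symmetric distance matrices."""
    n = len(gen)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    x = [geo[i][j] for i, j in pairs]

    def corr(order):
        # Relabel rows and columns of the genetic matrix together.
        y = [gen[order[i]][order[j]] for i, j in pairs]
        return _pearson(x, y)

    identity = list(range(n))
    r_obs = corr(identity)
    rng = random.Random(seed)
    hits = 0
    for _ in range(permutations):
        perm = identity[:]
        rng.shuffle(perm)
        if corr(perm) >= r_obs:
            hits += 1
    # Count the observed arrangement as one member of the null distribution.
    return r_obs, (hits + 1) / (permutations + 1)


if __name__ == "__main__":
    # Hypothetical sites along a coastline (km) with FST proportional to distance.
    positions = [0, 100, 250, 450, 700]
    geo = [[abs(a - b) for b in positions] for a in positions]
    gen = [[0.0001 * d for d in row] for row in geo]
    print(mantel_test(gen, geo))
```

Permuting whole rows and columns together, rather than shuffling individual matrix entries, preserves the dependence among distances that share a sample, which is the reason a Mantel test is used instead of an ordinary correlation test.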

Contact

Tara McClintock
Fisheries Specialist
taf4@cornell.edu
631-727-7850 x 11317

Last updated July 26, 2019

Cornell Cooperative Extension

Suffolk County