Gold Genome

  • Genome backbone: 10x coverage of PacBio Sequel or Nanopore ultra-long read data.
  • Genome error correction: 80-100x coverage of Illumina short-read data (2 x 150 bp paired-end reads), from pooled libraries with multiple insert sizes (500, 800, 1200, and 1800 bp).
  • Scaffolding: 20 kb mate-pair library plus filtered ultra-long reads (>15 kb).
  • Pseudomolecules: GBS/ddRAD-based biparental linkage mapping on 96 samples, followed by anchoring of scaffolds into pseudomolecules.
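
The coverage targets and the >15 kb read filter listed above come down to simple arithmetic. The following is a minimal Python sketch of that arithmetic, not the pipeline actually used: the function names, the four-line FASTQ assumption, and the example genome size are all illustrative assumptions.

```python
from typing import Iterable, Iterator, List, Tuple

Read = Tuple[str, str]  # (read name, sequence)


def parse_fastq(lines: Iterable[str]) -> Iterator[Read]:
    """Yield (name, sequence) pairs, assuming strict four-line FASTQ records."""
    it = iter(lines)
    for header in it:
        if not header.strip():
            continue  # tolerate blank lines between records
        seq = next(it).strip()
        next(it)  # '+' separator line
        next(it)  # quality string (unused in this sketch)
        yield header.strip()[1:], seq


def filter_long_reads(reads: Iterable[Read], min_len: int = 15_000) -> List[Read]:
    """Keep reads strictly longer than min_len (the >15 kb scaffolding filter)."""
    return [(name, seq) for name, seq in reads if len(seq) > min_len]


def coverage_depth(reads: Iterable[Read], genome_size: int) -> float:
    """Mean depth = total sequenced bases / estimated genome size."""
    return sum(len(seq) for _, seq in reads) / genome_size


def read_pairs_needed(genome_size: int, depth: int, read_len: int = 150) -> int:
    """Read pairs required for a target depth with 2 x read_len paired-end reads."""
    bases_needed = genome_size * depth
    bases_per_pair = 2 * read_len
    return -(-bases_needed // bases_per_pair)  # integer ceiling division


if __name__ == "__main__":
    # Hypothetical 500 Mb genome at the lower 80x Illumina target.
    print(read_pairs_needed(500_000_000, 80))
```

The genome size must come from an independent estimate (for example flow cytometry or k-mer analysis); it is only a placeholder here.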

Standard Bioinformatics includes:

  • Hybrid assembly
  • Scaffolding
  • Pseudomolecule generation
  • Genome Annotation & Comparative Genomics
  • Gene prediction & annotation using ab initio, homology-based, and cDNA-based methods.
  • BLASTP alignment to KEGG, SwissProt, and TrEMBL databases.
  • Motif and domain identification using InterProScan against protein databases, including ProDom, PRINTS, Pfam, SMART, PANTHER, and PROSITE.
  • Non-coding RNA (ncRNA) prediction: annotation of rRNAs, transfer RNAs (tRNAs), snRNAs, and microRNAs.
  • Syntenic block and gene collinearity analysis against 6 major plant species, with synonymous (Ks) and non-synonymous (Ka) substitution rates for orthologous gene pairs.
  • Phylogenomic analysis of sequenced landraces/wild populations to identify selective sweeps, together with identity-by-descent (IBD), linkage disequilibrium (LD), and population structure analyses.
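
Two of the statistics named above have compact definitions. The sketch below, in Python, shows the pairwise LD measure r^2 computed from phased 0/1 haplotypes and the Ka/Ks ratio computed from rates that have already been estimated. It is an illustration under those assumptions, with hypothetical function names, and does not represent the tools used in the service.

```python
from typing import Optional, Sequence


def ld_r2(site_a: Sequence[int], site_b: Sequence[int]) -> Optional[float]:
    """r^2 between two biallelic sites from phased haplotypes coded 0/1.

    Returns None when either site is monomorphic (r^2 is undefined).
    """
    if len(site_a) != len(site_b) or len(site_a) == 0:
        raise ValueError("sites must have the same non-zero number of haplotypes")
    n = len(site_a)
    p_a = sum(site_a) / n                       # frequency of allele 1 at site A
    p_b = sum(site_b) / n                       # frequency of allele 1 at site B
    p_ab = sum(1 for a, b in zip(site_a, site_b) if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                        # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return None
    return d * d / denom


def ka_ks_ratio(ka: float, ks: float) -> Optional[float]:
    """Ka/Ks for one orthologous pair; None when Ks is zero.

    Ka and Ks are assumed to have been estimated beforehand from a codon
    alignment; this function only forms the ratio.
    """
    if ks == 0:
        return None
    return ka / ks


if __name__ == "__main__":
    print(ld_r2([0, 0, 1, 1], [0, 0, 1, 1]))   # perfectly correlated sites
    print(ka_ks_ratio(0.1, 0.5))
```

As a rule of thumb, Ka/Ks below 1 is read as purifying selection, near 1 as neutral evolution, and above 1 as possible positive selection.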