Glossary
Anchor markers—
Genes conserved across genomes that can be used in comparative mapping and phylogenetic analysis.
Annotation—
Adding pertinent information such as start/stop codons, intron/exon boundaries, 5' and 3' untranslated regions, gene coded for, amino acid sequence, or other commentary to the database entry of raw nucleotide sequence information.
Apomixis—
Asexual plant reproduction in which meiosis and fertilization are altered so that only one parent contributes genes to the offspring.
Bacterial artificial chromosomes (BAC)—
Vectors used to clone large (100–300 kb) inserts of genomic DNA that can be stably maintained in Escherichia coli cell cultures.
BAC end sequencing—
Sequencing of the ends of BAC clones (roughly 600 nucleotides), a process that is useful for creating a tiling pathway of BACs across a chromosome.
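As an illustration of what a tiling pathway is, the sketch below picks a small set of overlapping clones that spans a region. It assumes the clones have already been placed on a common coordinate system; in practice that placement is inferred from the end sequences and other mapping data. The clone names, coordinates, and the function `minimal_tiling_path` are hypothetical.

```python
def minimal_tiling_path(clones, start, end):
    """Greedy interval cover: return clone names spanning [start, end].

    clones is a list of (name, clone_start, clone_end) tuples with
    hypothetical coordinates in base pairs.
    """
    ordered = sorted(clones, key=lambda c: c[1])
    path = []
    covered_to = start
    i = 0
    while covered_to < end:
        best = None
        # Among clones that begin inside the covered region,
        # keep the one that reaches farthest to the right.
        while i < len(ordered) and ordered[i][1] <= covered_to:
            if best is None or ordered[i][2] > best[2]:
                best = ordered[i]
            i += 1
        if best is None or best[2] <= covered_to:
            raise ValueError(f"gap in clone coverage at position {covered_to}")
        path.append(best[0])
        covered_to = best[2]
    return path


# Hypothetical clone placements (kb scaled to arbitrary units).
example_clones = [
    ("BAC1", 0, 150),
    ("BAC2", 100, 250),
    ("BAC3", 120, 300),
    ("BAC4", 280, 420),
    ("BAC5", 290, 400),
]
print(minimal_tiling_path(example_clones, 0, 400))  # ['BAC1', 'BAC3', 'BAC4']
```

The greedy choice keeps the number of clones to sequence small, which is the practical reason for building a tiling pathway.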
Bermuda rules—
The data release policy of the International Sequencing Consortium, deriving from the principle that sequence will be of greatest public benefit if freely available. Under the rules, assemblies of 1–2 kilobases are deposited in public data banks every 24 hours, and no patents are filed.
Bioinformatics—
The science of managing and analyzing biological data, including genomic research data, using advanced computing techniques.
Brassicaceae—
The mustard family. Members include Arabidopsis, canola, broccoli, rape, cabbage, kale, and cauliflower.
Cis-acting elements—
DNA sequences in the vicinity of the structural portion of a gene that regulate gene expression.
cDNA (complementary DNA) libraries—
A collection of DNA clones representing a population of messenger RNA, from which all non-coding intron sequences have been removed.
Comparative Genomics—
The comparison of gene and genome structure, function and evolution across taxa.
Coverage—
A representation of the redundancy, and therefore the accuracy, of sequencing. At 5x coverage, a given base has been examined, or “covered,” an average of 5 times.
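A common back-of-the-envelope estimate of average coverage is total bases sequenced divided by genome size. The sketch below assumes reads of uniform length spread evenly over the genome; the function name and the numbers are illustrative only.

```python
def mean_coverage(num_reads: int, read_length: int, genome_size: int) -> float:
    """Average number of times each base is sequenced."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return num_reads * read_length / genome_size


# Hypothetical project: 1,000,000 reads of 600 bases over a 120,000,000-base genome.
print(mean_coverage(1_000_000, 600, 120_000_000))  # 5.0
```

This is an average: individual bases may be covered more or fewer times than the figure suggests.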
Diploid—
A state in which each type of chromosome is present as a pair of homologous chromosomes.
DNA (deoxyribonucleic acid)—
The fundamental molecule encoding genetic information. DNA is a double-stranded molecule held together by weak bonds between base pairs of nucleotides. The four nucleotides in DNA contain the bases adenine (A), guanine (G), cytosine (C), and thymine (T).
DNA sequence—
The relative order of the nucleotide bases making up the DNA along the chromosomes.
Draft sequence—
The determined order of base pairs of a chromosomal area at a level of 4 to 5x coverage.
Evolutionary nodes—
Points of evolutionary divergence, representing ancient speciation events.
Expressed Sequence Tags (ESTs)—
The result of large-scale partial sequencing of randomly selected cDNA clones. ESTs are a useful tool for gene identification, localization, and mapping.
Fabaceae—
The legume family. Members include soybean, beans, cowpeas, peas, and alfalfa as well as numerous tropical trees.
Finished sequencing—
Additional sequencing needed to fill gaps, reduce ambiguities, and increase sequence accuracy to no more than 1 error per 10,000 base pairs. The finished version will provide an estimated 8x to 9x coverage of each chromosome.
Fixed inbred lines—
Plant lines in which all loci are homozygous and which breed true when selfed. Populations of fixed inbred lines may be developed from biparental crosses, e.g., Recombinant Inbred (RI) lines or Doubled Haploid (DH) lines. These are considered immortal populations because each line retains its genetic integrity when selfed.
Forward genetics—
Identification of mutants followed by genetic crosses to locate the genes in which the mutations occurred.
Functional genomics—
The analysis of genes, their resulting proteins, and the role played by the proteins in an organism’s biochemical processes.
Gene-rich regions—
DNA sequences that contain a high percentage of coding sequences, and less than average amounts of non-coding DNA.
Genetic linkage—
The location of genes close together on the same chromosome arm. Linked genes tend to recombine less frequently than unlinked genes.
Genetic map—
A linear designation of the relative positions of genes on chromosomes and the distance between them, in linkage units, based on frequency of intergenic recombination.
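To make "linkage units based on frequency of recombination" concrete, the sketch below converts a count of recombinant offspring into a map distance in centimorgans (cM). It shows the simple estimate (1% recombination is about 1 cM) alongside Haldane's mapping function, one standard correction for undetected double crossovers. The offspring counts and function names are hypothetical.

```python
import math


def naive_map_distance_cM(recombinants: int, total: int) -> float:
    """Simple estimate: percent recombinant offspring, read as centimorgans."""
    return 100.0 * recombinants / total


def haldane_map_distance_cM(recombinants: int, total: int) -> float:
    """Haldane's mapping function: d = -50 * ln(1 - 2r), with r the recombination fraction."""
    r = recombinants / total
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return -50.0 * math.log(1.0 - 2.0 * r)


# Hypothetical cross: 10 recombinant offspring out of 100 scored.
print(naive_map_distance_cM(10, 100))               # 10.0
print(round(haldane_map_distance_cM(10, 100), 2))   # about 11.16
```

The two estimates agree closely for tightly linked loci and diverge as the recombination fraction approaches 0.5.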
Genome—
The entire complement of genetic material in a chromosome set.
Genomics—
The science and technology associated with the large-scale DNA sequencing of the complete set of chromosomes from a species and the interpretation of that sequence in terms of its organization, function and evolution.
Genomic DNA—
DNA containing both coding (exon) and noncoding (intron) sequences.
Genotype—
Genetic composition of an individual.
Heterochromatin—
Highly compacted DNA containing very few genes.
Heterosis—
The observation that in some circumstances, a hybrid offspring exhibits higher fitness (or productivity) than either of its parents.
Homeostasis—
The tendency of an organism or population to reach equilibrium and resist change.
Homologue—
A gene related to a second gene by descent from a common ancestral DNA sequence. The term may apply to the relationship between genes separated by the event of speciation or to the relationship between genes separated by the event of genetic duplication.
High-throughput sequencing—
A fast, bulk method of determining the order of bases in DNA.
Immortal mapping populations—
See Fixed inbred lines (above).
Library—
An unordered collection of related pieces of DNA (for example, cloned DNA from a particular organism) whose relationship to each other can be established by physical mapping.
Linkage disequilibrium—
A state in which alleles are inherited together more often than can be accounted for by chance, indicating that the two alleles are physically close on the DNA strand or that they are simultaneously the product of selection.
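One common way to quantify "more often than can be accounted for by chance" is the coefficient D, the observed frequency of a two-locus haplotype minus the frequency expected if the alleles were independent, together with its normalized form r². The sketch below assumes two biallelic loci with alleles written A/a and B/b; the haplotype sample is invented for illustration.

```python
def linkage_disequilibrium(haplotypes):
    """Return (D, r_squared) for haplotypes written as two-character strings such as 'AB'."""
    n = len(haplotypes)
    p_a = sum(1 for h in haplotypes if h[0] == "A") / n
    p_b = sum(1 for h in haplotypes if h[1] == "B") / n
    p_ab = sum(1 for h in haplotypes if h == "AB") / n
    d = p_ab - p_a * p_b  # zero when the alleles associate at random
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r_squared = d * d / denom if denom > 0 else 0.0
    return d, r_squared


# Hypothetical samples: complete association, then random association.
print(linkage_disequilibrium(["AB"] * 4 + ["ab"] * 4))        # (0.25, 1.0)
print(linkage_disequilibrium(["AB", "Ab", "aB", "ab"] * 2))   # (0.0, 0.0)
```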
Loci (plural of locus)—
Positions of genes or other markers on a chromosome.
Messenger RNA (mRNA)—
Nucleotide segments carrying information from DNA and serving as a template for protein synthesis.
Microarrays—
Sets of miniaturized chemical reaction areas that may be used to test DNA fragments, antibodies, or proteins. Microarrays can be used to monitor genetic diversity based on DNA-DNA hybridization, or to measure changes in gene expression, based on hybridization of a given mRNA population to cDNA embedded in a silica chip.
Model species—
A plant species that can serve as a unifying experimental model. The committee refers to Arabidopsis thaliana as the model plant species.
P1-derived artificial chromosome (PAC)—
Vectors based on the genome of bacteriophage P1 (a virus), used to clone large DNA fragments that can be stably maintained in Escherichia coli cell cultures.
Phenotype—
Detectable, outward manifestations of a given genotype.
Phylogeny—
Evolutionary relationships among organisms; the evolutionary history of a group of organisms.
Physical map—
A map representing physical distances between genes and other markers (for example, restriction enzyme cutting sites), as measured in nucleotide base pairs.
Ploidy—
The number of chromosome sets in a given organism.
Poaceae—
The grass family. Members include rice, maize, sorghum, sugarcane, wheat, barley, oat, and fescue.
Polymerase chain reaction (PCR)—
A technique used for amplification of specific DNA segments.
Polyploid—
An organism with more than two sets of a basic, or monoploid, number of chromosomes, such as a triploid, pentaploid, or hexaploid. Many plant genomes are polyploid.
Protein chip—
Microarray technology for protein profiling.
Quantitative trait loci (QTL)—
Genetic loci that affect a quantitatively inherited trait.
Reference species—
Plant species that serve as references for the species in major agronomically relevant plant taxa.
Reverse genetics—
Isolation and sequencing of a desired gene, and subsequent creation of mutations in it.
Serial analysis of gene expression (SAGE)—
A tool allowing analysis of overall gene expression.
Shotgun sequencing—
A sequencing method that involves sequencing randomly selected cloned pieces of the genome, in contrast to “directed” sequencing methods, in which pieces of DNA from known chromosomal locations are sequenced.
Single nucleotide polymorphisms (SNPs)—
DNA sequence variation that occurs when a single nucleotide (A, T, G, or C) is altered. SNPs can be useful in detecting genetic variation among individuals in a given population.
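As a minimal illustration, single-base differences between two sequences can be listed by comparing them position by position. The sketch assumes the two sequences are already aligned and of equal length, which real data rarely guarantee; the sequences and the function name are made up for the example.

```python
def find_single_base_differences(reference: str, sample: str):
    """Return (1-based position, reference base, sample base) for each mismatch."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (pos, ref_base, sample_base)
        for pos, (ref_base, sample_base) in enumerate(zip(reference, sample), start=1)
        if ref_base != sample_base
    ]


# Hypothetical aligned sequences differing at one position.
print(find_single_base_differences("ATGCCGTA", "ATGTCGTA"))  # [(4, 'C', 'T')]
```

Whether a difference found this way counts as a SNP also depends on how often it occurs in the population, which a pairwise comparison cannot show.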
Solanaceae—
The nightshade family. Members include tomato, potato, tobacco, eggplant and petunia.
Synteny—
Linkage of genes along a chromosome. Conserved synteny refers to the conservation of gene order on chromosomes of different species.
Transcription—
The synthesis of an RNA copy from a sequence of DNA; the first step in gene expression.
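The base-pairing logic of transcription can be sketched in a few lines: the RNA is complementary to the DNA template strand, with uracil (U) in place of thymine (T). The function below is a toy model that ignores promoters, termination, and RNA processing; the sequence and function name are hypothetical.

```python
# RNA base that pairs with each DNA template base.
TEMPLATE_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_5_to_3: str) -> str:
    """Return the RNA (written 5' to 3') copied from a DNA template strand given 5' to 3'."""
    # The template is read 3' to 5', so walk it in reverse.
    return "".join(TEMPLATE_TO_RNA[base] for base in reversed(template_5_to_3))


# Hypothetical template strand; the matching coding strand is 5'-ATGGCC-3'.
print(transcribe("GGCCAT"))  # AUGGCC
```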
Transcriptome—
The full complement of activated genes, transcripts, and mRNA in a given tissue at any particular time.
Transformation—
The process of integrating exogenous DNA into the genetic material of another organism.
Transposable elements—
Genetic elements that may “jump” to new locations, often disrupting the function of the genes into which they are inserted. These elements sometimes encode enzymes that synthesize an identical copy of the element and insert it at a new site.