1. What is FluGenome?
FluGenome is a portal for accessing the lineage and genotype information of influenza A viruses and a Web tool for determining their lineages and genotypes.
2. Who are we?
This project is a collaboration between the Centers for Disease Control and Prevention (CDC Influenza Division, Dr. Ruben Donis, Branch Chief MVVB) and the University of Nebraska at Omaha (UNO Department of Biology, Dr. G. Lu’s group). Rebecca Garten (from CDC) and Thaine Rowley (from UNO) are the main players in the implementation of this exciting project.
3. What does the genotype mean?
Generally speaking, the genotype is the genetic makeup encoded in an individual’s DNA. In a narrow sense, the genotype is formed with alleles of marker genes and presented in sequential order, e.g., AA.
4. What does influenza A viral genotype look like?
We proposed a nomenclature for naming influenza A viral genotypes, as shown in the following figure. A letter is assigned to each lineage of PB2, PB1, PA, NP, and M, and a number followed by a letter is assigned to each lineage of HA, NA, and NS, with the number representing the subtype or allele.
Example of a Genotype.
5. What does “Determine Genotype” mean?
It means that the user can submit partial or complete genome sequences for the determination of genotypes.
6. How to “Determine Genotype”?
Three steps are involved in determining genotypes with FluGenome. Go to the “Determine Genotype” page, then 1) choose either the BLAST or MegaBLAST method; 2) set up filter options with coverage and identity values; 3) choose the number of genomes to compare.
The program determines the lineage of each viral gene segment first. The genotype is then constructed simply by the sequential incorporation of the lineages for each of the eight segments, arranged per convention as shown above.
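The assembly step described above can be sketched as follows. This is only an illustration, not FluGenome's actual code: the segment order is assumed to follow the standard influenza A segment numbering (1 to 8), the bracketed, comma-separated output format is an assumption, and the lineage labels in the example are made up.

```python
# Assumed order: standard influenza A segment numbering 1-8.
SEGMENT_ORDER = ["PB2", "PB1", "PA", "HA", "NP", "NA", "M", "NS"]


def build_genotype(lineages):
    """Join per-segment lineage labels into one genotype string.

    `lineages` maps a segment name to its lineage label, e.g. "A" for
    PB2 or "3A" for HA (subtype number followed by a lineage letter).
    """
    missing = [seg for seg in SEGMENT_ORDER if seg not in lineages]
    if missing:
        raise ValueError("missing lineage for segment(s): " + ", ".join(missing))
    # The bracketed, comma-separated layout is an assumed display format.
    return "[" + ",".join(lineages[seg] for seg in SEGMENT_ORDER) + "]"


# Made-up lineage labels, for illustration only.
example = {
    "PB2": "A", "PB1": "D", "PA": "B", "HA": "3A",
    "NP": "A", "NA": "2A", "M": "B", "NS": "1A",
}
print(build_genotype(example))  # prints [A,D,B,3A,A,2A,B,1A]
```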
7. What does “Determine Individual Gene Segment Lineage” mean?
It means that the user can submit gene segment sequences for the determination of lineages.
8. How to “Determine Individual Gene Segment Lineage”?
There are three steps involved in the determination. Go to the “Determine Individual Gene Segment Lineage” page, then 1) choose either the BLAST or MegaBLAST method; 2) set up filter options with coverage and identity values; 3) choose the gene segment.
If the length of a sequence is less than or equal to 35% of the maximum length, no result is displayed. Instead, a message indicates that the sequence is too short to make an accurate prediction.
If the length of a sequence is greater than 35% but less than or equal to 75% of the maximum length, the result is displayed along with a message that the sequence is too short and that the prediction should be used with caution.
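The two length thresholds above can be written as a small check. This is a minimal sketch: the function name, the return labels, and the `max_length` parameter (the maximum length for the chosen segment) are hypothetical and not taken from FluGenome.

```python
def classify_length(seq_length, max_length):
    """Classify a submitted sequence by its length relative to max_length.

    Returns "too_short" (no result shown), "caution" (result shown with a
    warning), or "ok" (result shown without a warning).
    """
    if max_length <= 0:
        raise ValueError("max_length must be positive")
    # Integer arithmetic avoids floating-point surprises at the boundaries.
    if seq_length * 100 <= 35 * max_length:
        return "too_short"
    if seq_length * 100 <= 75 * max_length:
        return "caution"
    return "ok"


# Hypothetical maximum length of 2000 nucleotides.
for length in (700, 701, 1500, 1501):
    print(length, classify_length(length, 2000))
```

With a maximum length of 2000, lengths of 700 and below give no result, 701 to 1500 give a result with a warning, and 1501 and above give a normal result.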
9. What is the Genotype database?
The Genotype database has everything relevant to the complete genomes and their genotypes. The database can be queried with a variety of options, such as segment lineages, genotypes, hosts, countries, serotypes, years, and text.
10. What is the Segment database?
The Segment database contains information related to sequences of individual segments. It can be queried by segment, lineage, host, etc.
11. How to use sample data?
Two sets of sample data are available: genome sequences and gene segment sequences. You can use them to test the functions of “Determine Genotype” and “Determine Individual Gene Segment Lineage”, respectively.
12. How are individual gene segment lineages determined?
There is no gold standard or set of criteria for influenza that can be applied to the determination of segment lineages. Two assumptions were made first: 1) all influenza A gene segments have a common ancestor; 2) the HA, NA, and NS gene segments have serotypes or alleles, while all other gene segments do not.
The lineages of each viral gene were carefully determined as detailed below: 1) using the phylogenetic trees constructed, significant clusters (which were segregated by approximately 10% nucleotide difference by p-distance) were assigned lineages; 2) bootstrap analysis was used on a smaller set of sequences, with values greater than 90% considered significant; 3) the initial lineages were evaluated for nucleotide differences within and between other lineages and for the strength of bootstrap support; 4) approximately 10 sequences from each lineage were randomly selected for the Maximum Likelihood (ML) analysis for each gene segment, serotype (for HA, NA), or allele (for NS) on the MultiPhyl server.
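For readers unfamiliar with the p-distance mentioned in step 1, it is the proportion of aligned nucleotide sites at which two sequences differ. The sketch below shows that calculation for two pre-aligned sequences. Skipping sites with gaps or ambiguous bases is an assumption made here, and the function is an illustration rather than the software used in the original analysis.

```python
VALID_BASES = set("ACGT")


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        # Assumption: ignore sites with a gap or an ambiguous base.
        if a not in VALID_BASES or b not in VALID_BASES:
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


# One difference over ten compared sites.
print(p_distance("ACGTACGTAC", "ACGTTCGTAC"))  # prints 0.1
```

Under the approximate 10% criterion, a pair of sequences with a p-distance near or above 0.1 would be candidates for different lineages, subject to the tree topology and bootstrap support described above.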
13. How are genome re-assortments determined?
This is done by comparing the predicted genotype with a unique genotype database, as illustrated in the figure below.
Example of a Triple Reassortment Event.
14. How often is the database updated?
The FluGenome database is updated daily.
15. What is a Genome ID?
A Genome ID is assigned to a strain that has a complete or nearly complete (>75% of the gene segment length) sequence for each of the eight influenza A gene segments. Genome IDs are created by matching the eight segments through the strain name. When a candidate genome is found, the sequences’ host, serotype, lineage, and length are checked against each other to confirm a positive match.
When a positive match is found, the host category is determined and the abbreviation for that category is looked up. A number (the internal ID for the sequence) is appended to this abbreviation as a suffix. The result is that genome’s “full genome accession number.”
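The Genome ID construction described above might look roughly like the following sketch. The host abbreviations, record fields, and function name are all hypothetical, since the actual FluGenome codes and schema are not given here. For brevity, only the host and serotype consistency checks are shown; the lineage and length checks are omitted.

```python
# Hypothetical host-category abbreviations; the real codes are not given.
HOST_ABBREVIATIONS = {"avian": "AV", "human": "HU", "swine": "SW"}
EXPECTED_SEGMENTS = {"PB2", "PB1", "PA", "HA", "NP", "NA", "M", "NS"}


def make_genome_id(segment_records, internal_id):
    """Return a Genome ID for one strain's segment records, or None.

    Each record is a dict with hypothetical keys "segment", "host", and
    "serotype". All records are assumed to share the same strain name.
    """
    found = {rec["segment"] for rec in segment_records}
    if len(segment_records) != 8 or found != EXPECTED_SEGMENTS:
        return None  # not exactly one sequence for each of the 8 segments
    hosts = {rec["host"] for rec in segment_records}
    serotypes = {rec["serotype"] for rec in segment_records}
    if len(hosts) != 1 or len(serotypes) != 1:
        return None  # segments disagree, so this is not a positive match
    abbreviation = HOST_ABBREVIATIONS.get(hosts.pop())
    if abbreviation is None:
        return None  # unknown host category
    # Host abbreviation followed by the internal ID as a suffix.
    return f"{abbreviation}{internal_id}"


records = [
    {"segment": seg, "host": "avian", "serotype": "H5N1"}
    for seg in sorted(EXPECTED_SEGMENTS)
]
print(make_genome_id(records, 1234))  # prints AV1234
```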