Glossary of Terms

Glossary of Sequence-Related Terms - (February 25th, 2024)

Many of the terms are oversimplified here, and readers are urged to search online for any term to get a better understanding and learn more.



GenBank; This is the huge US government database, run by the National Center for Biotechnology Information (NCBI), for all sequences; not just fungi, but all life forms such as bacteria, plants and animals. Many of SSMC's sequences have been uploaded to GenBank, with more being added. An example;

https://www.ncbi.nlm.nih.gov/nucleotide/OP345842.1?report=genbank&log$=nucltop&blast_rank=1&RID=X21S4WER01N


BLAST; This is a search tool within GenBank that compares your specific sequence to all other sequences in the GenBank database. It is a short-term exercise: the results are provided and then expire. BLAST results are always changing as new sequences are added to the database.


BOLD Systems;  Database from the University of Guelph in Ontario, Canada, that has a lot of sequences from Northern Europe that are not in GenBank. SSMC has a large number of sequences that were sequenced there and are in this database. A terrific resource.


MycoMap Beta;  Database owned by the Hoosier Mushroom Society and managed by Stephen Russell. Many MinION sequences are being deposited here. SSMC has a lot of sequences in this database, some of which are not in GenBank - yet.


UNITE community;  Another database and search program, based on the SH (species hypothesis) concept. At UNITE, fungi are classified based strictly on ITS sequence results. Largely European, but with a good inclusion of North American sequences as well. Great visual map for results.


FunDiS - Fungal Diversity Survey;  Previously called the Mycoflora Project, it was originally designed to help community scientists get their fungi sequenced. Today its mission has shifted somewhat toward enhancing and protecting rare mushrooms and their habitat.


ITS Test; ITS stands for internal transcribed spacer. It is by far the most common portion of the genome used for ID purposes. It is divided into three portions: ITS1, 5.8S and ITS2. Other segments of the genome can be looked at, but ITS is by far the go-to test for comparison purposes. The two ITS regions (ITS1 and ITS2) are non-coding.


Non-coding vs. coding;  Coding portions of the genome have real-life chores in a functioning organism, while non-coding portions are mostly “junk” just along for the ride. The non-coding regions (ITS1 and ITS2) have no dedicated purpose in maintaining life, so a mutation there can stay around indefinitely. While the non-coding areas have little or no purpose, they are valuable to us for determining taxonomy.


bp;  Shorthand for base pair. An ITS test sequence is generally 600 to 700 bp long, although longer and shorter sequences are not uncommon. If all bp are the same between two mushrooms, the two are almost always the same species. If 3% of the bp do not match between two mushrooms, that generally suggests the two are different species. 3% is not a hard rule.
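
To make the 3% guideline concrete: on a 650 bp sequence, 3% works out to roughly 20 mismatched letters. The sketch below is only an illustration, not how BLAST works. It assumes the two sequences have already been lined up to the same length, and the function names are made up for this example.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of positions where two already-aligned sequences match."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    # Compare letter by letter, ignoring upper/lower case.
    matches = sum(1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a == b)
    return 100.0 * matches / len(seq_a)


def percent_mismatch(seq_a: str, seq_b: str) -> float:
    """Percent of positions that differ (the number the 3% guideline refers to)."""
    return 100.0 - percent_identity(seq_a, seq_b)


# Two toy 10-letter sequences that differ only in the last letter.
print(percent_mismatch("ACTGACTGAC", "ACTGACTGAT"))
```

Real sequences are rarely the same length, so tools like BLAST first work out the best alignment (including gaps) before counting matches.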


ACTG; Each base in a sequence is designated by one of these 4 letters; the building blocks that direct all life. If a letter in your sequence results is not one of these four (for example N, which means any base), that generally means there is some uncertainty as to which letter it is actually meant to be. It may be a match or it may not be.
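
If you want to check a sequence for letters other than A, C, T and G, a few lines of Python will do it. This is a minimal sketch; the function name is invented for this example.

```python
STANDARD_BASES = set("ACGT")


def find_ambiguous(sequence: str) -> list:
    """Return (position, letter) for every letter that is not A, C, G or T.

    Positions are counted from 1, the way people usually count bases.
    """
    return [
        (position, letter)
        for position, letter in enumerate(sequence.upper(), start=1)
        if letter not in STANDARD_BASES
    ]


# Toy sequence with two uncertain letters, at positions 3 and 6.
print(find_ambiguous("ACNTGR"))
```

An empty list means every letter in the sequence is one of the four standard bases.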


FASTA file; The sequence you see, a solid string of six or seven hundred letters (A, C, T and G's), makes up the total ITS identity of a single mushroom, and each result is the FASTA file of that mushroom. It is this file that is used in the databases and BLAST searches mentioned above.
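
In a FASTA file, each sequence starts with a header line beginning with ">" (the name), followed by one or more lines of letters. The sketch below reads that layout into a Python dictionary. It is a bare-bones illustration; the function name and the record name "OLY-0000" are made up for this example.

```python
def parse_fasta(text: str) -> dict:
    """Read FASTA-formatted text into {header: sequence}."""
    pieces = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # Header line: everything after ">" names the record.
            current = line[1:].strip()
            pieces[current] = []
        elif current is not None:
            # Sequence line: may be wrapped over several lines.
            pieces[current].append(line.upper())
    return {name: "".join(parts) for name, parts in pieces.items()}


example = """>OLY-0000 hypothetical example record
ACTGACTG
actg
"""
print(parse_fasta(example))
```

For real work, an established library such as Biopython is a safer choice than a hand-written parser.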


Barcode;  A method of specimen identification using short, standardized segments of DNA. Every species has its own barcode. The DNA barcode can be compared to a reference library to provide an ID. (Wikipedia)


Query Cover (QC);  Sequence results of any two mushrooms, even of the same species, can be different lengths. QC is the percentage of your sequence (the query) that lines up with the sequence it is being compared to. A QC of 100% means your whole sequence was matched; a QC of less than 100% means part of it was not, often because one of the two sequences is shorter. If the QC is as low as 70% or so, that suggests one sequence is short enough that it might be problematic.
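
A rough way to picture the QC number: divide the part of your sequence that lined up with the database sequence by the full length of your sequence. The sketch below assumes a single lined-up stretch; real BLAST results can combine several stretches, and the function name is invented for this example.

```python
def query_cover(query_length: int, aligned_start: int, aligned_end: int) -> float:
    """Percent of the query covered by one aligned stretch.

    aligned_start and aligned_end are 1-based positions on the query, inclusive.
    """
    if query_length <= 0:
        raise ValueError("query_length must be positive")
    if not (1 <= aligned_start <= aligned_end <= query_length):
        raise ValueError("aligned range must fall inside the query")
    aligned_bases = aligned_end - aligned_start + 1
    return 100.0 * aligned_bases / query_length


# A 650 bp query where only the first 455 bp lined up with the match.
print(query_cover(650, 1, 455))
```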


Distance Tree;  In GenBank, and in DNA software programs, a tree can be built that compares different mushrooms and puts them into groupings (you get to pick which mushroom sequences you want the tree to consider). Each “branch” on a tree tends to contain mushrooms of the same species. If you build a tree and the mushrooms on a particular branch are all named the same except for one outlier, that suggests the outlier is likely not ID’d correctly.


Voucher Mushroom;  The definition is a bit fuzzy and depends on who is asking. In all cases, though, a voucher requires basic information on the mushroom, like habitat, date of collection, and who found the mushroom and where. For SSMC, it means at least an effort was made to get a sequence and get the mushroom to a herbarium. It also generally means there are good pictures of the live mushroom (before drying).


OLY-1234 Voucher Number; This is a unique number used to represent SSMC mushrooms that have been sent in for sequencing. The number requires that the tissue has been dried and sent in for sequencing. The number is tied to either an iNat number or a Mushroom Observer number. Most SSMC sequences are successful, but even mushrooms that were sent in and not successfully sequenced get this unique number.


WTU - Burke herbarium; One of the final resting spots for good-quality voucher mushrooms in the PNW. The Burke, as part of the UW, has been around for a long time, but in the new age of DNA results, vouchered specimens have a new, much more reliable meaning. Additionally, mushrooms accepted into the Burke collection now require photographs tied to the specimen.

https://www.pnwherbaria.org/data/search.php


PNW Provisional Numbered Names;  This is a new “provisional naming organizational method.” While not officially accepted, it is increasingly being used to give meaning to mushrooms that do not have a species determination - yet. You will see these on Danny’s DNA Discoveries.


Danny’s DNA Discoveries;  A large portion of the sequence work in the PNW has been organized and made public by Danny Miller. I do not think I am overstating it to say he is a genius when it comes to understanding DNA as it relates to the taxonomy of PNW mushrooms. Danny is tireless in getting the information online in a way that is valuable not only to scientists but also to non-scientists. Danny’s efforts can be seen here;

https://alpental.com/psms/ddd/index.htm


ALVALABS and Molecular Solutions;  Private sequencing labs that SSMC has used when a paid-for sequence was needed. ALVALABS is in Spain and Molecular Solutions is in Portland, Oregon. Both do fantastic work.


Index Fungorum;  Is it a real species? Go to Index Fungorum, click on “search” and type in the scientific name;

http://www.indexfungorum.org/


MinION;  The latest and greatest hope to bring down the cost of sequencing. Small and almost portable, it seems to be rapidly replacing the old Sanger sequencing method. You will see it identified on iNat and/or GenBank, where it is listed as Oxford Nanopore MinION.