Integrated Taxonomic Information System - Taxamatch for Metadata

ITIS Taxonomic Metadata Tool Use Guidelines

The ITIS Taxonomic Name Compare Tool can generate reports and SGML output for a component of the FGDC Biological Data Profile (FGDC-STD.001.1-1999). This allows users to quickly obtain the taxonomic hierarchy in SGML form for an unlimited number of scientific names. The SGML output can then be imported into metadata software or other applications.

To use the tool, follow these steps:

  1. Create a text document (file ending in .txt) with a list of scientific names only (without author, rank, etc.). The first line of the file must be the word "name", followed by one scientific name per line. Each file may contain names from only one kingdom.
    If your dataset covers more than one kingdom, generate a separate report for each kingdom and append the resulting SGML output into one file. (See the note below regarding duplicate names occurring in more than one kingdom.)
  2. Go to the ITIS Taxon Compare page and click on the browse button. Select the file you just created, then click the "Upload File" button. A message appears indicating the file upload was successful.
  3. Click OK to return to the previous page. The file name will now appear in the "File Name" text block.
  4. Verify that the file name shown matches the file you want to compare.
  5. If you have multiple columns in your text file (e.g., name and author), select the character you're using to separate the columns. You may use the pipe ( | ), tab, or the number sign ( # ).
  6. Click the "Next" button to continue.
  7. The next page begins with a "View Data File" button. You can click this to see a formatted version of your data file and ensure it is being parsed correctly by the tool. Click the "Back to Option Selection" button to return from this page.
  8. Select the kingdom contained in your file.
  9. In the next section, select the "Scientific Name" option.
  10. In the Display options section, leave both boxes checked.
  11. Click the "FGDC Compare" button to start the comparison with ITIS. Note that the time to finish this comparison will vary depending on how many names are listed in your text file.
  12. The next page contains your match/non-match report, consisting of five possible sections:
    • Matches Between ITIS and Input File – Valid/Accepted Names
      This lists the valid scientific names in your data that were successfully matched in ITIS. It also lists the valid/accepted names in ITIS that correspond to invalid/not accepted names in your data.
    • Matches Between ITIS and Input File – Invalid/Not Accepted Names
      This shows scientific names in your data and in ITIS that are invalid/not accepted, with their associated TSNs (Taxonomic Serial Numbers). It also shows the corresponding valid/accepted names in ITIS for these names. These invalid/not accepted names will be replaced by the accepted names in the SGML output.
    • Non-matches from Input File
      These are names that are not matched at all, either due to misspelling (a common problem) or another cause.
    • Duplicates in Input File
      This identifies scientific names that are listed more than once in your input file.
    • Duplicates Existing in ITIS
      This identifies scientific names that are listed more than once in the ITIS database. You can click on the TSN links to explore what the duplicates represent, then decide which record to use in each case. Once you've decided, select the checkbox in the "Use" column for each record you want to use (see the note on duplicates below).

    This report is your opportunity to check for any problems with your data. You may need to make changes to your text file and repeat the compare steps until you are satisfied that the report is correct.

  13. Use the "Generate SGML" section to create an SGML export file when you are satisfied with the report content.
    • Download
      Click the "Download" button to immediately download the generated SGML file to your computer.
    • View
      Click the "View" button to see the generated SGML output. Note that the view page also has a "Download" button.
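
The file format described in step 1 can be prepared with a short script. The following is a minimal Python sketch, not part of the ITIS tool: the function names are hypothetical, and the whitespace cleanup and the UTF-8 encoding are assumptions rather than documented requirements. It writes the "name" header followed by one name per line, and reports names that occur more than once so you can review them before uploading. Removing the repeats is a convenience choice; the tool itself reports them in its "Duplicates in Input File" section.

```python
from collections import Counter
from pathlib import Path


def build_input_lines(names):
    """Return (lines, duplicates) for a list of names from ONE kingdom.

    lines      -- the header word "name" followed by each distinct name
    duplicates -- sorted names that appeared more than once in the input
    """
    # Collapse runs of whitespace and skip blank entries (an assumption:
    # the guidelines do not say how the tool treats stray spaces).
    cleaned = [" ".join(n.split()) for n in names if n.strip()]
    counts = Counter(cleaned)
    duplicates = sorted(n for n, c in counts.items() if c > 1)
    # dict.fromkeys keeps the first occurrence of each name, in order.
    unique = list(dict.fromkeys(cleaned))
    return ["name"] + unique, duplicates


def write_input_file(path, names):
    """Write the upload file and return the duplicate names found."""
    lines, duplicates = build_input_lines(names)
    # UTF-8 is an assumption; use whatever encoding the tool accepts.
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return duplicates


# Illustrative data only.
lines, dups = build_input_lines(["Ficus carica", "Quercus  alba", "Ficus carica"])
print(lines)
print(dups)
```

Calling write_input_file("plant_names.txt", names) would then produce a file ready for the upload in step 2; the file name here is only an example.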


Note on Duplicates:

Duplicate occurrences of the same scientific name (in ITIS or in other sources) can, generally speaking, represent either the same taxon or different taxa. ITIS contains some duplicates representing the same taxon, due to different practices of ITIS' predecessor database (National Oceanographic Data Center's "N.O.D.C. Code") or accidental introduction into ITIS.

The ITIS policy for "same taxon" duplicate name cases is to render one record as an invalid/not accepted "database artifact" (reflected in unacceptability_reason and comment_detail), and to link it to the non-artifact record (or next to it if the non-artifact is itself invalid/not accepted for other reasons). For example, see Acirsa borealis (Lyell, 1841) (TSNs 72352 & 72354). Such cases are an ongoing cleanup effort in ITIS.

"Different taxon" duplicate cases (homonymy, or the use of the same name to represent more than one taxon) can occur under several circumstances:

  1. The two names may be regulated by different Nomenclatural Codes (e.g., the International Code of Zoological Nomenclature, the International Code of Botanical Nomenclature, the International Code of Nomenclature of Bacteria, each of which regulates names at particular ranks in particular kingdoms), or not regulated by any such Code.

    Such cases are perfectly "legal" (though perhaps frowned upon). For example, Ficus Röding, 1798 (TSN 73159) is a valid mollusk genus, while Ficus L. (TSN 19081) is an accepted flowering plant (fig) genus. N.O.D.C. data practices created two additional copies of each genus record, and ITIS inherited those names (TSNs 73160 & 19082 respectively) and made them invalid - unavailable, database artifact; and not accepted - database artifact, respectively.

    In another example, Ctenophora is used as a valid animal phylum for comb jellies (Ctenophora Eschscholtz, 1829, TSN 53856) and a valid genus of crane fly (Ctenophora Meigen, 1803, TSN 118845). The rank of phylum is not regulated by a Nomenclatural Code, so there is no requirement to address this homonymy.

    Note that cross-kingdom homonymies can result in multiple different metadata listings for your matching name, so take care in appending multiple-kingdom outputs into the same file.

  2. Where names are regulated by the same Code, such homonymy ought not occur, but sometimes it does, and sometimes the two names can both be seen as valid/accepted! When authors catch such cases they can resolve them with a replacement name, but sometimes this has not yet happened.

    For example, Dendrocerus australicus (Dodd, 1914) is applied to two valid taxa within the same insect genus: TSN 611215 was originally described in the genus Lygocerus, and TSN 611214 was originally described in the genus Megaspilus. They became secondary homonyms when both species were moved into Dendrocerus. Both are considered valid taxa, and no replacement name has yet been designated in the literature to resolve the homonymy. Such cases are extremely rare within a particular taxonomic group, but become somewhat more common between groups (mollusks vs. birds, etc.) or in very large or lesser-known groups.
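
Because each input file may hold only one kingdom, and the same name can be in use in more than one kingdom, it can help to split a mixed dataset by kingdom and flag cross-kingdom names before running the comparisons. The following Python sketch assumes you already know the kingdom of each record in your own data. The function name and the kingdom labels are illustrative and hypothetical, and nothing here queries ITIS.

```python
from collections import defaultdict


def group_by_kingdom(records):
    """Split (scientific_name, kingdom) pairs into per-kingdom name lists.

    Returns (by_kingdom, cross_kingdom):
      by_kingdom    -- {kingdom: [names in input order]}
      cross_kingdom -- {name: [kingdoms]} for names seen in 2+ kingdoms
    """
    by_kingdom = defaultdict(list)
    kingdoms_for_name = defaultdict(set)
    for name, kingdom in records:
        by_kingdom[kingdom].append(name)
        kingdoms_for_name[name].add(kingdom)
    # A name recorded under more than one kingdom is a candidate
    # cross-kingdom homonym and deserves a manual check.
    cross_kingdom = {
        name: sorted(kingdoms)
        for name, kingdoms in kingdoms_for_name.items()
        if len(kingdoms) > 1
    }
    return dict(by_kingdom), cross_kingdom


# Illustrative records based on the Ficus and Ctenophora examples above.
records = [
    ("Ficus", "Animalia"),       # mollusk genus
    ("Ficus", "Plantae"),        # fig genus
    ("Ctenophora", "Animalia"),
]
by_kingdom, cross_kingdom = group_by_kingdom(records)
print(by_kingdom)
print(cross_kingdom)
```

Each list in by_kingdom can then be written to its own input file. Names listed in cross_kingdom are the ones to watch when appending the per-kingdom SGML outputs into a single file.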