
Integrated Taxonomic Information System
National Museum of Natural History
Washington, D.C.

Data Development History and Data Quality

The Integrated Taxonomic Information System (ITIS) was established in the mid-1990s as a cooperative project among several federal agencies to improve and expand upon taxonomic data (known as the NODC Taxonomic Code[1]) maintained by the National Oceanographic Data Center (NODC), National Oceanic and Atmospheric Administration (NOAA). ITIS inherited approximately 210,000 scientific names with varying levels of data quality from the NODC data set. While many important taxonomic groups were not well represented (e.g., terrestrial insects), the rate of errors and omissions within represented taxonomic groups ranged from relatively low (e.g., few misspellings or occasional typographical errors) to rather high (e.g., many species names without authors or dates, or species assigned to wrong groups).

The ITIS mission is to create a scientifically credible database of taxonomic information, placing primary focus on taxa of interest to North America, with world treatments included, as available. Within this framework, the initial data content development and quality assurance strategy was to begin with the NODC data and proceed on two tracks: (1) adding new names or checklists with a high level of taxonomic credibility, and (2) reviewing and verifying the legacy NODC data, thereby bringing it to a minimal, or higher, standard of data quality. Pending review and improvement, the unverified legacy data have been retained in the ITIS database to meet the needs of ITIS partners and cooperators who use the names and their associated unique identifiers (Taxonomic Serial Numbers - TSNs) in specific applications. Since the 1996 import of the legacy dataset, ITIS has grown to nearly 656,000 scientific names, more than 85% of which have been verified in the literature, leaving about 97,000 names as unverified legacy data.[2]

Although the ITIS database initially was populated with names derived from the NODC Taxonomic Code, it is being expanded to link individual names to one or more credible references (e.g., print publications, recognized experts, databases). (Those references may or may not also be linked to other names contained within the group; for example, a family name may be linked to a publication that was used to verify its status and position, but that publication might not reference the subordinate genera or species within the family.) This process of verification based on credible references is at the core of ITIS activities.

Depending on the rank of the scientific name (e.g., kingdom, family, subspecies), each ITIS name record has one to three data quality indicators[3] associated with it:

  • Record Credibility Rating
  • Latest Record Review
  • Global Species Completeness

Every scientific name record in ITIS, regardless of the name's rank, has the data quality indicator Record Credibility Rating, denoting whether it has undergone internal review. Because the NODC records originally imported into the ITIS database were of unknown quality, each was assigned a Record Credibility Rating of unverified. As these records are reviewed, credibility ratings are changed to either the highest value, verified standards met, or verified minimum standards met. A rating of verified standards met indicates to the user that all elements in the record and the position of the scientific name in the hierarchy are perceived to be accurate and supported by one or more credible references. If data in the record have been reviewed but are incomplete and/or contain accuracy, placement, or nomenclatural issues, or are from a non-peer-reviewed source, a rating of verified minimum standards met is assigned. During the process of adding new names to ITIS, some of the unverified legacy records in the same taxonomic group are vetted (i.e., unverified records are verified and the Record Credibility Rating is adjusted). As a result, more than 53% of the original NODC records have now been verified, and efforts to improve the quality of ITIS legacy data continue.
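The percentages quoted in this document can be cross-checked against its own counts. The following is a minimal sketch, not an official ITIS calculation: it uses only the approximate figures given in the text, and it assumes that all of the roughly 97,000 unverified names are NODC legacy records (which the text implies but does not state outright). The variable and function names are illustrative.

```python
# Approximate figures quoted in the text (counts as of Jul 2014).
TOTAL_NAMES = 656_000        # "nearly 656,000 scientific names"
UNVERIFIED_LEGACY = 97_000   # "about 97,000 names as unverified legacy data"
ORIGINAL_NODC = 210_000      # "approximately 210,000 scientific names"


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Share of all ITIS names that have been verified.
verified_overall = percent(TOTAL_NAMES - UNVERIFIED_LEGACY, TOTAL_NAMES)

# Share of the original NODC records that have been verified.
# Assumption: every unverified name is one of the original NODC records.
verified_legacy = percent(ORIGINAL_NODC - UNVERIFIED_LEGACY, ORIGINAL_NODC)

print(f"verified, all names:    {verified_overall:.1f}%")  # 85.2%
print(f"verified, NODC records: {verified_legacy:.1f}%")   # 53.8%
```

Both results agree with the "more than 85%" and "more than 53%" statements, within the rounding of the quoted counts.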

Latest Record Review, a second ITIS data quality indicator, is assigned to records with names at ranks above species (e.g., genera, families, orders), and represents the year that the record was last reviewed by ITIS. For example, if a family name is listed as verified standards met, with a Latest Record Review rating of 1997, a user can assume that the record was reviewed in that year. Users should be aware, however, that taxonomic changes might have been published since that review and not yet incorporated into the ITIS files. (Additionally, users can refer to the dates of cited publications, which provide another indication of the currency of the record.) For original NODC data, all records were initially rated as unknown for this data quality indicator, but are being adjusted as records are reviewed.

The third ITIS data quality indicator, Global Species Completeness, indicates whether or not, for a given valid/accepted name (i.e., current standing) for a taxon at the rank of genus or higher, all known valid/accepted species for that taxon were incorporated into ITIS at the time of review. Completeness ratings of unknown (such as were given to the original NODC data records) are adjusted to the appropriate level, partial or complete, when adequate information supports a change. Both Latest Record Review and Global Species Completeness indicators also are used by ITIS in making decisions about the timeliness of peer review of a group.
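The rules above for which indicators apply at which rank can be summarized in a short sketch. This is an illustration only, not the ITIS schema: the class names, field names, and the shortened rank ladder are hypothetical, and the real ITIS rank list is longer. The indicator values are the ones named in the text.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Credibility(Enum):
    UNVERIFIED = "unverified"
    MINIMUM_STANDARDS_MET = "verified minimum standards met"
    STANDARDS_MET = "verified standards met"


class Completeness(Enum):
    UNKNOWN = "unknown"
    PARTIAL = "partial"
    COMPLETE = "complete"


# Hypothetical, simplified rank ladder, highest rank first.
RANKS = ["kingdom", "phylum", "class", "order", "family",
         "genus", "subgenus", "species", "subspecies"]


def applicable_indicators(rank: str) -> List[str]:
    """List the data quality indicators that apply to a name of this rank."""
    position = RANKS.index(rank)
    indicators = ["Record Credibility Rating"]        # every record has this
    if position < RANKS.index("species"):             # ranks above species
        indicators.append("Latest Record Review")
    if position <= RANKS.index("genus"):              # genus or higher
        indicators.append("Global Species Completeness")
    return indicators


@dataclass
class NameRecord:
    """A name record; the defaults mirror an unreviewed NODC legacy record."""
    name: str
    rank: str
    credibility: Credibility = Credibility.UNVERIFIED
    latest_review_year: Optional[int] = None          # None means "unknown"
    completeness: Completeness = Completeness.UNKNOWN


for rank in ("family", "subgenus", "species"):
    print(rank, "->", applicable_indicators(rank))
```

Under these assumptions a family record carries all three indicators, a subgenus record carries two, and a species or subspecies record carries only the Record Credibility Rating, which matches the "one to three" range stated earlier.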

There are two major web interfaces to the ITIS data set:

The Canadian partner-developed websites have somewhat different searching and reporting functions from the main (U.S.) site, and link to additional resources (specimen mapping, etc.). The Canadian dataset is usually synchronized once a month, whereas the main ITIS database is updated as chunks of data pass through an editing and proofing process.

A useful feature for reviewing the status and level of verification of taxonomic groups in ITIS is available via the Canadian partner websites, which use color-coding to represent the Record Credibility Rating for given name records. When a name is brought up in a report, if that name is valid/accepted (i.e., current standing) and contains any other taxa (e.g., a valid family with genera in ITIS), a link is provided to a table showing the number of valid/accepted subordinate taxa, broken down by rank and level of review. Thus, one can look up an order, follow this link, and see these tabulated numbers. For example:

http://www.cbif.gc.ca/pls/itisca/taxachild?p_tsn=78810&p_ifx=plglt&p_lang=

Additionally, each category in this table (if it is less than some maximum that will not place a performance strain on the system) has a link, so that a user can see an alphabetical listing of, for example, all valid species of chitons in ITIS, or just the unverified species, via the links on the tabulated page. For example:

http://www.cbif.gc.ca/pls/itisca/taxachild?p_tsn=78810&p_value=220&p_ifx=plglt&p_lang=
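The two example links differ only in their query parameters, so they can be assembled programmatically. The sketch below simply reproduces the URL pattern shown above using Python's standard library. The function name and its defaults are hypothetical, the meaning of `p_value`, `p_ifx`, and `p_lang` is not documented in this text, and the links may no longer resolve.

```python
from urllib.parse import urlencode

# Base address taken from the example links in the text.
BASE_URL = "http://www.cbif.gc.ca/pls/itisca/taxachild"


def taxachild_url(tsn, value=None, ifx="plglt", lang=""):
    """Build a report URL for the taxon with the given TSN.

    Without `value`, the result matches the first example (the summary
    table); with `value`, it matches the second (a listing for one
    category of that table). Parameter order follows the examples.
    """
    params = [("p_tsn", tsn)]
    if value is not None:
        params.append(("p_value", value))
    params.append(("p_ifx", ifx))
    params.append(("p_lang", lang))
    return BASE_URL + "?" + urlencode(params)


print(taxachild_url(78810))
print(taxachild_url(78810, value=220))
```

Run as written, the two printed lines are identical to the two example URLs above.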

Such reports on any valid/accepted name in ITIS that contains other taxa can be found by looking up the name via the Canadian websites (click on the name of interest if you get a list of multiple matches instead of a single full report), then scrolling down to the Taxonomic Hierarchy... up to Kingdom box. Just below the name in question is a line reading "contains ## valid taxa (## verified)"; that line links to the tabulated report, which in turn has links to the list of names in each category.

The degree of review and completion in ITIS varies greatly by group. Some groups have had relatively little review at the genus and species level (e.g., Pseudoscorpionida), and others have had more (e.g., Sepiolida). Some taxonomic groups have expanded in coverage by the addition of regional species or lists, but without review of NODC-derived names in the same group; others have had complete world lists added, coupled with complete reconciliation of the unverified legacy records. A summary of updated data and work in progress is available at the main ITIS website's Data Status Page.

ITIS is making steady progress both in expanding coverage and improving data quality. We welcome offers of assistance from the taxonomic community as we continue to improve the database that, by necessity, still contains a large amount of unverified legacy data. Over the next few years we hope to add data development personnel and expand cooperation with specialists. While we are currently expanding the database group by group and at the request of users and partners, we hope to develop technological solutions that will facilitate and accelerate this process, as well as expedite the task of improving the quality of legacy records. We continue to consider changes that will enrich the display and use of data at the ITIS websites and promote better understanding of data quality.



[1] For an account of the history of the NODC Taxonomic Code, see http://www.nodc.noaa.gov/General/CDR-detdesc/taxonomic-v8.html.

[2] Counts as of July 2014.

[3] For complete definitions of ITIS data quality indicators, see the Glossary at http://www.itis.gov/glossary.html.
