Title: | Generating Summaries, Reports and Plots from the World Checklist of Vascular Plants |
---|---|
Description: | A companion to the World Checklist of Vascular Plants (WCVP). It includes functions to generate maps and species lists, as well as match names to the WCVP. For more details and to cite the package, see: Brown M.J.M., Walker B.E., Black N., Govaerts R., Ondo I., Turner R., Nic Lughadha E. (in press). "rWCVP: A companion R package to the World Checklist of Vascular Plants". New Phytologist. |
Authors: | Matilda Brown [aut, cre], Barnaby Walker [aut] |
Maintainer: | Matilda Brown |
License: | GPL (>= 3) |
Version: | 1.2.6 |
Built: | 2024-11-23 04:35:11 UTC |
Source: | https://github.com/matildabrown/rwcvp |
Get area description from vector of area codes
get_area_name(area_codes)
area_codes |
Character vector containing the set of codes to be mapped to a name. |
Usually used as an inverse function for get_wgsrpd3_codes. Useful for condensing sets of codes for e.g. file names, plotting and table formatting.
Character. Either a vector of length one, with a name for the set of Level 3 areas, or (if no name exists for that set of areas) the input vector of codes.
get_area_name(get_wgsrpd3_codes("Brazil"))
Extract WGSRPD Level 3 (area) codes.
get_wgsrpd3_codes(geography, include_equatorial = NULL)
geography |
Character. The geography to convert into Level 3 codes. May be a WGSRPD area (Level 3), region (Level 2) or continent (Level 1), country (political) or hemisphere ("Northern Hemisphere", "Southern Hemisphere" or "Equatorial") |
include_equatorial |
Logical. Include Level 3 areas that span the equator? Defaults to NULL. |
Country mapping follows Gallagher et al. (2020). Importantly, this means that some overseas territories are not considered part of the country in this system, e.g. the Canary Islands are designated as their own Level 3 area rather than as part of Spain. Where this is ambiguous, the mapping can be explored using View(wgsrpd_mapping).
Gallagher, R. V., Allen, S., Rivers, M. C., Allen, A. P., Butt, N., Keith, D., & Adams, V. M. (2020). Global shortfalls in extinction risk assessments for endemic flora. bioRxiv, 2020.2003.2012.984559. https://doi.org/10.1101/2020.03.12.984559
Character with area codes (Level 3) that fall within the geography.
get_wgsrpd3_codes("Brazil")
Plot a POWO style map for given range and range centroids.
powo_map(range_sf, centroids_sf)
range_sf |
A simple features ( |
centroids_sf |
A simple features ( |
A ggplot map of the range
Range maps displayed on the POWO website have a fixed, discrete colour palette based on the type of taxon occurrence in a region.
powo_pal()
scale_color_powo(...)
scale_colour_powo(...)
scale_fill_powo(...)
... |
Arguments passed on to
|
Character. Vector of names and HEX values to match those of POWO.
A dataset containing 20 sampled Red List assessments for name-matching
redlist_example
A data frame with 20 rows and 4 variables:
Red List identifier
Taxon name.
Red List threat category
Taxon author/s.
Downloaded and sampled from https://www.iucnredlist.org/
A dataset containing the higher classification (Angiosperms, Gymnosperms, Ferns and Lycophytes) and Order for each family in the WCVP.
taxonomic_mapping
A data frame with 457 rows and 3 variables: family, order and higher_classification
Fern and lycophyte taxonomy from PPG I (2016; doi:10.1111/jse.12229). Angiosperm taxonomy from APG IV (2016; doi:10.1111/boj.12385). Gymnosperm taxonomy from Forest et al. (2018; doi:10.1038/s41598-018-24365-4).
Generate a species checklist from WCVP
wcvp_checklist(
  taxon = NULL,
  taxon_rank = c("species", "genus", "family", "order", "higher"),
  area_codes = NULL,
  synonyms = TRUE,
  render_report = FALSE,
  native = TRUE,
  introduced = TRUE,
  extinct = TRUE,
  location_doubtful = TRUE,
  hybrids = FALSE,
  infraspecies = TRUE,
  report_filename = NULL,
  report_dir = NULL,
  report_type = c("alphabetical", "taxonomic"),
  wcvp_names = NULL,
  wcvp_distributions = NULL
)
taxon |
Character. Taxon to be included. Defaults to NULL (no taxonomic filter; all taxa). |
taxon_rank |
Character. One of "species", "genus", "family", "order" or "higher", giving the rank of the value/s in taxon. |
area_codes |
Character. One or many WGSRPD level 3 region codes. Defaults to NULL. |
synonyms |
Logical. Include synonyms in checklist (see Details)? Defaults to TRUE. |
render_report |
Logical. Render the checklist as a markdown report? Defaults to FALSE. |
native |
Logical. Include species occurrences not flagged as introduced, extinct or doubtful? Defaults to TRUE. |
introduced |
Logical. Include species occurrences flagged as introduced? Defaults to TRUE. |
extinct |
Logical. Include species occurrences flagged as extinct? Defaults to TRUE. |
location_doubtful |
Logical. Include species occurrences flagged as location doubtful? Defaults to TRUE. |
hybrids |
Logical. Include hybrid species in checklist? Defaults to FALSE. |
infraspecies |
Logical. Include infraspecific taxa in checklist? Defaults to TRUE. |
report_filename |
Character. Name for the HTML file. Defaults to taxon_area_type.html |
report_dir |
Character. Directory for the HTML file to be saved in. Must be provided by user. |
report_type |
Character; one of "alphabetical" (the default) or "taxonomic". Should the generated checklist be sorted alphabetically, or by taxonomic status? |
wcvp_names |
A data frame of taxonomic names from WCVP version 7 or later.
If |
wcvp_distributions |
A data frame of distributions from WCVP version 7 or later.
If |
The synonyms argument can be used to limit names to those that are Accepted. If synonyms = TRUE then invalid, illegitimate and other non-accepted names are also included (i.e., the checklist is not limited to names for which taxon_status == "Synonym").
Two styles of checklist are supported in rWCVP: alphabetical and taxonomic.
In an alphabetical checklist, all names are arranged alphabetically with accepted names in bold, and synonyms are followed by their accepted name.
For a taxonomic checklist, names are grouped by their accepted names, and synonyms are listed beneath. Both types of checklist include author, publication and distribution information, though note that family headings are only supported in alphabetical checklists (due to the additional grouping requirement of the taxonomic format).
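The effect of the synonyms argument described above can be pictured with a small base-R sketch. This is not the package's own code: the table, its three invented names and the "Illegitimate" status value are assumptions made for illustration, and only the taxon_status column name comes from the text.

```r
# Hypothetical names table; only the taxon_status column name is from the text.
names_tbl <- data.frame(
  taxon_name = c("Name one", "Name two", "Name three"),
  taxon_status = c("Accepted", "Synonym", "Illegitimate"),
  stringsAsFactors = FALSE
)

# Analogue of synonyms = FALSE: keep accepted names only.
accepted_only <- names_tbl[names_tbl$taxon_status == "Accepted", ]

# Analogue of synonyms = TRUE: keep every status, not just "Synonym".
all_names <- names_tbl

# Filtering on taxon_status == "Synonym" alone would drop the illegitimate name.
synonym_only <- names_tbl[names_tbl$taxon_status == "Synonym", ]
```

The point of the comparison is that the non-accepted set is broader than the rows whose status is literally "Synonym".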
Data frame with filtered data and, if render_report=TRUE, a report HTML file.
# These examples take >10 seconds to run and require 'rWCVPdata'
if(requireNamespace("rWCVPdata")){
  wcvp_checklist(taxon = "Myrtaceae", taxon_rank = "family", area_codes = get_wgsrpd3_codes("Brazil"))
  wcvp_checklist(taxon = "Ferns", taxon_rank = "higher", area_codes = get_wgsrpd3_codes("New Zealand")) %>% head()
}
Generate spatial distribution objects for species, genera or families
wcvp_distribution(
  taxon,
  taxon_rank = c("species", "genus", "family", "order", "higher"),
  native = TRUE,
  introduced = TRUE,
  extinct = TRUE,
  location_doubtful = TRUE,
  wcvp_names = NULL,
  wcvp_distributions = NULL
)
taxon |
Character. The taxon to be mapped. Must be provided. |
taxon_rank |
Character. One of "species", "genus", "family", "order" or "higher", giving the rank of the value in taxon. |
native |
Logical. Include native range? Defaults to TRUE. |
introduced |
Logical. Include introduced range? Defaults to TRUE. |
extinct |
Logical. Include extinct range? Defaults to TRUE. |
location_doubtful |
Logical. Include occurrences that are thought to be doubtful? Defaults to TRUE. |
wcvp_names |
A data frame of taxonomic names from WCVP version 7 or later.
If |
wcvp_distributions |
A data frame of distributions from WCVP version 7 or later.
If |
Where taxon_rank is higher than species, the distribution of the whole group will be returned, not individual species within that group. This also applies when toggling options: for example, introduced occurrences will only be included if they are outside the native range, regardless of whether native=TRUE or native=FALSE.
To identify extinctions, introductions or doubtful occurrences within the native range, the wcvp_summary and wcvp_occ_mat functions can be used.
Simple features (sf) data frame containing the range polygon/s of the taxon.
# this example requires 'rWCVPdata'
if(requireNamespace("rWCVPdata")){
  r <- wcvp_distribution("Callitris", taxon_rank = "genus")
  p <- wcvp_distribution_map(r)
  p
}
Plot distribution maps for species, genera or families
wcvp_distribution_map(
  range,
  crop_map = FALSE,
  native = TRUE,
  introduced = TRUE,
  extinct = TRUE,
  location_doubtful = TRUE
)
range |
Simple features ( |
crop_map |
Logical. Crop map extent to distribution? Defaults to FALSE. |
native |
Logical. Include native range? Defaults to TRUE. |
introduced |
Logical. Include introduced range? Defaults to TRUE. |
extinct |
Logical. Include extinct range? Defaults to TRUE. |
location_doubtful |
Logical. Include occurrences that are thought to be doubtful? Defaults to TRUE. |
The colour scheme mirrors that used by Plants of the World (POWO; https://powo.science.kew.org/), where green is native, purple is introduced, red is extinct and orange is doubtful. See Examples for how to use custom colours.
A ggplot2::ggplot of the distribution.
# these examples require 'rWCVPdata'
if(requireNamespace("rWCVPdata")){
  p <- wcvp_distribution_map(wcvp_distribution("Callitris", taxon_rank = "genus"))
  p

  # now only the native range, and cropped to range extent
  p <- wcvp_distribution_map(wcvp_distribution("Callitris", taxon_rank = "genus"),
    introduced = FALSE, crop_map = TRUE
  )
  p

  # now with different colours
  # note that this taxon only has native and introduced occurrences, so only two colours are needed
  p <- wcvp_distribution_map(wcvp_distribution("Callitris", taxon_rank = "genus"))
  p +
    # for polygons
    ggplot2::scale_fill_manual(values = c("red", "blue")) +
    # for points (islands)
    ggplot2::scale_colour_manual(values = c("red", "blue"))
}
Exact matching of names to the WCVP, optionally using the author string to refine results.
wcvp_match_exact(names_df, wcvp_names, name_col, author_col = NULL, id_col)
names_df |
Data frame of names for matching. |
wcvp_names |
Data frame of taxonomic names from WCVP version 7 or later.
If |
name_col |
Character. The column in |
author_col |
the column in |
id_col |
the column in |
Match results from WCVP bound to the original data from names_df.
Other name matching functions: wcvp_match_fuzzy(), wcvp_match_names()
# these examples require 'rWCVPdata'
if(requireNamespace("rWCVPdata")){
  wcvp_names <- rWCVPdata::wcvp_names

  # including author string
  wcvp_match_exact(redlist_example, wcvp_names, "scientificName",
    author_col = "authority", id_col = "assessmentId"
  )

  # without author string
  wcvp_match_exact(redlist_example, wcvp_names, "scientificName", id_col = "assessmentId")
}
Fuzzy matching to names in the WCVP using phonetic matching and edit distance. The WCVP can be loaded for matching from rWCVPdata::wcvp_names.
wcvp_match_fuzzy(names_df, wcvp_names, name_col, progress_bar = TRUE)
phonetic_match(names_df, wcvp_names, name_col)
edit_match(names_df, wcvp_names, name_col)
names_df |
Data frame of names for matching. |
wcvp_names |
Data frame of taxonomic names from WCVP version 7 or later.
If |
name_col |
Character. The column in |
progress_bar |
Logical. Show progress bar when matching? Defaults to TRUE. |
The wcvp_match_fuzzy function uses phonetic matching first and then finds the closest match based on edit distance for any remaining names.
Phonetic matching uses phonics::metaphone encoding with a maximum code length of 20.
Edit distance matching finds the closest match based on Levenshtein similarity, calculated by RecordLinkage::levenshteinSim.
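To make the edit-distance step concrete, here is a minimal base-R sketch of picking the closest candidate by a Levenshtein similarity score. It is not the package's implementation: it uses utils::adist rather than RecordLinkage, and it assumes the score is normalised as one minus the edit distance divided by the length of the longer string. The helper lev_sim and the query and candidate names are made up for illustration.

```r
# Levenshtein similarity in base R (assumed normalisation: 1 - d / max length).
lev_sim <- function(a, b) {
  d <- utils::adist(a, b)                     # edit distances, length(a) x length(b)
  longest <- outer(nchar(a), nchar(b), pmax)  # longer string length for each pair
  1 - d / longest
}

# Hypothetical inputs: one misspelt query and three candidate names.
query <- "Poa anua"
candidates <- c("Poa annua", "Poa alpina", "Poa trivialis")

sims <- lev_sim(query, candidates)
best <- candidates[which.max(sims)]
best
```

In this toy case the query is one insertion away from the first candidate, so that candidate has the highest similarity.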
Match results from WCVP bound to the original data from names_df.
Other name matching functions: wcvp_match_exact(), wcvp_match_names()
# this example requires 'rWCVPdata'
if(requireNamespace("rWCVPdata")){
  wcvp_names <- rWCVPdata::wcvp_names
  wcvp_match_fuzzy(redlist_example, wcvp_names, "scientificName")
}

# this example requires 'rWCVPdata'
if(requireNamespace("rWCVPdata")){
  wcvp_names <- rWCVPdata::wcvp_names
  phonetic_match(redlist_example, wcvp_names, "scientificName")
}

# this example requires 'rWCVPdata'
if(requireNamespace("rWCVPdata")){
  wcvp_names <- rWCVPdata::wcvp_names
  edit_match(redlist_example, wcvp_names, "scientificName")
}
Match names to WCVP, first using exact matching and then using fuzzy matching on any remaining unmatched names.
wcvp_match_names(
  names_df,
  wcvp_names = NULL,
  name_col = NULL,
  id_col = NULL,
  author_col = NULL,
  join_cols = NULL,
  fuzzy = TRUE,
  progress_bar = TRUE
)
names_df |
Data frame of names for matching. |
wcvp_names |
Data frame of taxonomic names from WCVP version 7 or later.
If |
name_col |
Character. The column in |
id_col |
Character. A column in |
author_col |
the column in |
join_cols |
Character. A vector of name parts to make the taxon name,
if |
fuzzy |
Logical; whether or not fuzzy matching should be used for names that could not be matched exactly. |
progress_bar |
Logical. Show progress bar when matching? Defaults to TRUE. |
By default, exact matching uses only the taxon name (supplied by name_col) unless a column specifying the author string is provided (as author_col).
Columns setting out name parts can be supplied as join_cols in place of a taxon name, but must be supplied in the order you want them joined (e.g. c("genus", "species", "infra_rank", "infra")).
Fuzzy matching uses a combination of phonetic and edit distance matching, and can optionally be turned off using fuzzy=FALSE.
The WCVP can be loaded for matching from rWCVPdata::wcvp_names.
See here for an example workflow.
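The ordering requirement for join_cols can be illustrated with a short base-R sketch of how name parts might be pasted into a single taxon name. This is an assumption about the general idea, not the package's own joining code. The helper join_parts, the taxon_name output column and the example rows (including the infraspecific epithet) are invented for illustration.

```r
# Hypothetical name parts; column names follow the ordering example in the text.
names_df <- data.frame(
  genus = c("Poa", "Callitris"),
  species = c("annua", "rhomboidea"),
  infra_rank = c("var.", NA),
  infra = c("aquatica", NA),
  stringsAsFactors = FALSE
)
join_cols <- c("genus", "species", "infra_rank", "infra")

# Paste the parts of each row in the given order, skipping missing parts.
join_parts <- function(df, cols) {
  apply(df[, cols, drop = FALSE], 1, function(parts) {
    paste(parts[!is.na(parts) & nzchar(parts)], collapse = " ")
  })
}

names_df$taxon_name <- join_parts(names_df, join_cols)
names_df$taxon_name
```

Because the parts are pasted in the order given, supplying the columns in a different order would produce a different, and probably unmatchable, name string.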
Match results from WCVP bound to the original data from names_df.
Other name matching functions: wcvp_match_exact(), wcvp_match_fuzzy()
# these examples require 'rWCVPdata'
if(requireNamespace("rWCVPdata")){
  wcvp_names <- rWCVPdata::wcvp_names

  # without author
  wcvp_match_names(redlist_example, wcvp_names,
    name_col = "scientificName", id_col = "assessmentId"
  )

  # with author
  wcvp_match_names(redlist_example, wcvp_names,
    name_col = "scientificName", id_col = "assessmentId", author_col = "authority"
  )
}
Generate occurrence matrix for taxa and areas
wcvp_occ_mat(
  taxon = NULL,
  taxon_rank = c("species", "genus", "family", "order", "higher"),
  area_codes = NULL,
  native = TRUE,
  introduced = TRUE,
  extinct = TRUE,
  location_doubtful = TRUE,
  wcvp_names = NULL,
  wcvp_distributions = NULL
)
taxon |
Character. One or many taxa to be included. Defaults to NULL (all species) |
taxon_rank |
Character. One of "species", "genus", "family", "order" or "higher", giving the rank of the value/s in taxon. |
area_codes |
Character. One or many WGSRPD level 3 region codes. Defaults to NULL. |
native |
Logical. Include species occurrences not flagged as introduced, extinct or doubtful? Defaults to TRUE. |
introduced |
Logical. Include species occurrences flagged as introduced? Defaults to TRUE. |
extinct |
Logical. Include species occurrences flagged as extinct? Defaults to TRUE. |
location_doubtful |
Logical. Include species occurrences flagged as location doubtful? Defaults to TRUE. |
wcvp_names |
A data frame of taxonomic names from WCVP version 7 or later.
If |
wcvp_distributions |
A data frame of distributions from WCVP version 7 or later.
If |
See here for an example of how this output can be formatted for publication.
A data.frame containing the taxon_name and plant_name_id of all species that are present in the area, plus one variable for each WGSRPD level 3 region in area, with species presences marked as 1 and absences marked as 0.
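The shape of this output can be pictured with a small base-R sketch that turns long-format occurrence records into a presence/absence table. It is illustrative only and not how the package builds its result: the occ data frame, its placeholder species names and the area_code_l3 column are assumptions, and the plant_name_id column is left out for brevity.

```r
# Hypothetical long-format occurrences: one row per species-area record.
occ <- data.frame(
  taxon_name = c("Species A", "Species A", "Species B", "Species C"),
  area_code_l3 = c("TAS", "VIC", "NSW", "TAS"),
  stringsAsFactors = FALSE
)
areas <- c("TAS", "VIC", "NSW")

# Cross-tabulate species against areas; any record counts as a presence.
present <- table(occ$taxon_name, factor(occ$area_code_l3, levels = areas)) > 0

# One row per species, one 0/1 column per area.
occ_mat <- data.frame(
  taxon_name = rownames(present),
  matrix(as.integer(present), nrow = nrow(present),
         dimnames = list(NULL, colnames(present))),
  stringsAsFactors = FALSE
)
occ_mat
```

Fixing the factor levels to the requested areas keeps a column for an area even when no species is recorded there.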
# this example requires 'rWCVPdata'
if(requireNamespace("rWCVPdata")){
  wcvp_occ_mat(
    taxon = "Poa", taxon_rank = "genus",
    area_codes = c("TAS", "VIC", "NSW"), introduced = FALSE
  )
}
Reformat local versions of WCVP
wcvp_reformat(wcvp_local, version = NULL)
wcvp_local |
Data.frame. Local copy of the WCVP. |
version |
Either 9 or "v9". We will add support for other versions as needed. |
Note that not all of the original variables are preserved during reformatting. For example, publication is a single variable in v9, but split over multiple in the data package. It is therefore not possible to simply rename this variable. Variables that are present in the data package but not in v9 are filled with NA.
A data.frame with the same variable structure as the WCVP that is included in the data package rWCVPdata.
Generate a summary table from the WCVP
wcvp_summary(
  taxon = NULL,
  taxon_rank = c("species", "genus", "family", "order", "higher"),
  area_codes = NULL,
  grouping_var = c("area_code_l3", "genus", "family", "order", "higher"),
  hybrids = FALSE,
  wcvp_names = NULL,
  wcvp_distributions = NULL
)
taxon |
Character. Taxon to be included. Defaults to NULL (no taxonomic filter; all taxa). |
taxon_rank |
Character. One of "genus", "family", "order" or "higher", giving the rank of the value/s in taxon. |
area_codes |
Character. One or many WGSRPD level 3 region codes. Defaults to NULL. |
grouping_var |
Character; one of "area_code_l3", "genus", "family", "order" or "higher". |
hybrids |
Logical. Include hybrid species in counts? Defaults to FALSE. |
wcvp_names |
A data frame of taxonomic names from WCVP version 7 or later.
If |
wcvp_distributions |
A data frame of distributions from WCVP version 7 or later.
If |
Valid values for rank 'higher' are 'Angiosperms', 'Gymnosperms', 'Ferns' and 'Lycophytes'.
Note that the grouping variable (if taxonomic) should be of a lower level than taxon and taxon_rank to produce a meaningful summary (i.e., it does not make sense to group a genus by genus, family or higher classification).
Additionally, if the grouping variable is taxonomic then species occurrences are aggregated across the input area. This means that if a species is native to any of the input area (even if it is introduced or extinct in other parts) it is counted as 'Native'. Similarly, introduced occurrences take precedence over extinct occurrences. Note that in this type of summary table, 'Endemic' means endemic to the input area, not necessarily to a single WGSRPD Level 3 Area within the input area.
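The precedence rule for aggregating occurrences across the input area can be sketched in base R as follows. This is a toy illustration of the rule as stated above, not the package's implementation. The occ data frame, its status column and the placeholder species are assumptions.

```r
# Hypothetical per-area status records for three species.
occ <- data.frame(
  taxon_name = c("Species A", "Species A", "Species B", "Species B", "Species C"),
  area_code_l3 = c("NZN", "NZS", "NZN", "NZS", "NZS"),
  status = c("Introduced", "Native", "Extinct", "Introduced", "Extinct"),
  stringsAsFactors = FALSE
)

# Precedence described in the text: native, then introduced, then extinct.
precedence <- c("Native", "Introduced", "Extinct")

# For each species keep the highest-precedence status found in any area.
agg <- tapply(occ$status, occ$taxon_name, function(s) {
  precedence[min(match(s, precedence))]
})
agg
```

Species A is counted as native because it is native in at least one area, even though it is introduced in another.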
Data.frame with filtered data, or a gt table
# this example requires 'rWCVPdata'
if(requireNamespace("rWCVPdata")){
  ferns <- wcvp_summary("Ferns", "higher", get_wgsrpd3_codes("New Zealand"), grouping_var = "family")
  wcvp_summary_gt(ferns)
}
Render a summary table from wcvp_summary
wcvp_summary_gt(x)
x |
List. |
gt table
# this example requires 'rWCVPdata'
if(requireNamespace("rWCVPdata")){
  ferns <- wcvp_summary("Ferns", "higher", get_wgsrpd3_codes("New Zealand"), grouping_var = "family")
  wcvp_summary_gt(ferns)
}
A dataset containing the area (Level 3), region (Level 2), continent (Level 1), country (political) and hemisphere category for each Level 3 area. Country mapping follows Gallagher et al. (2020).
wgsrpd_mapping
A data frame with 370 rows and 7 variables:
Northern, Southern or Equatorial (spanning the equator).
Continent code.
Continent.
Region code.
Region.
Country (political; from Gallagher et al., 2020)
Area code.
Area.
Modified from data available at https://github.com/tdwg/wgsrpd
Spatial data for WGSRPD Level 3, for plotting maps
wgsrpd3
An 'sf' object with 20 rows and 4 variables:
Region name
Region code
Level 2 code
Level 1 code (continent)
sf geometry
Used for mapping.
https://github.com/tdwg/wgsrpd/tree/master/level3