Title: | Tools for Summarising and Analysing Soundscape Data |
---|---|
Description: | A variety of tools relevant to the analysis of marine soundscape data. There are tools for downloading AIS (automatic identification system) data from Marine Cadastre <https://hub.marinecadastre.gov>, connecting AIS data to GPS coordinates, plotting summaries of various soundscape measurements, and downloading relevant environmental variables (wind, swell height) from the National Center for Atmospheric Research data server <https://rda.ucar.edu/datasets/ds084.1/>. Most tools were developed to work well with output from 'Triton' software, but can be adapted to work with any similar measurements. |
Authors: | Taiki Sakai [aut, cre], Anne Simonis [ctb], Shannon Rankin [ctb], Megan McKenna [ctb], Kaitlin Palmer [ctb] |
Maintainer: | Taiki Sakai <[email protected]> |
License: | GNU General Public License |
Version: | 0.10.1 |
Built: | 2025-02-12 19:32:42 UTC |
Source: | https://github.com/taikisan21/pamscapes |
Adds matching AIS data downloaded from Marine Cadastre to a dataframe containing location information
addAIS( x, ais, interpType = c("all", "close", "none"), interpTime = 0, interpCols = NULL )
x |
a dataframe with |
ais |
AIS data created using the readLocalAIS function |
interpType |
one of |
interpTime |
time (seconds) between new |
interpCols |
names of any extra columns to interpolate (other than
|
a dataframe with AIS data added, will contain more rows than x if ais has more than one vessel. If any interpolation is applied, any non-constant columns not specified to interpCols will be removed
Taiki Sakai [email protected]
gps <- data.frame(Latitude=c(33.2, 33.5, 33.6),
                  Longitude=c(-118.1, -118.4, -119),
                  UTC=as.POSIXct(c('2022-04-28 05:00:00',
                                   '2022-04-28 10:00:00',
                                   '2022-04-28 20:00:00'),
                                 tz='UTC'))
ais <- readLocalAIS(gps, aisDir=system.file('extdata/ais', package='PAMscapes'), distance=20e3)
gpsNoInterp <- addAIS(gps, ais, interpType='none')
str(gpsNoInterp)
gpsClose <- addAIS(gps, ais, interpType='close')
str(gpsClose)
gpsAllInterp <- addAIS(gps, ais, interpType='all')
str(gpsAllInterp)
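As a rough illustration of what interpolating positions onto new, evenly spaced time points can look like, the following base R sketch linearly interpolates a single vessel track with approx. The track values, the 300 second spacing, and the variable names are invented for illustration; this is not necessarily how addAIS performs its interpolation internally.

```r
# Hypothetical AIS track for one vessel (values invented for illustration)
track <- data.frame(
  UTC = as.POSIXct(c('2022-04-28 05:00:00', '2022-04-28 05:10:00'), tz = 'UTC'),
  Latitude = c(33.20, 33.30),
  Longitude = c(-118.10, -118.20)
)
# New time points every 300 seconds across the track
newTime <- seq(from = min(track$UTC), to = max(track$UTC), by = 300)
# Linear interpolation of each coordinate onto the new time points
interp <- data.frame(
  UTC = newTime,
  Latitude = approx(as.numeric(track$UTC), track$Latitude,
                    xout = as.numeric(newTime))$y,
  Longitude = approx(as.numeric(track$UTC), track$Longitude,
                     xout = as.numeric(newTime))$y
)
str(interp)
```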
Adds a summary of matching AIS data for nearby vessels to a dataframe. Information added includes the number of vessels, distance to nearby vessels, and average speed of nearby vessels
addAISSummary(x, ais, distance = 10000)
x |
a dataframe with |
ais |
AIS data created using the readLocalAIS function. Can also be a character listing the directory of AIS |
distance |
distance (meters) within locations in |
a dataframe with AIS summary data added. Will contain new columns:
- the number of ships within "distance" at this time
- average distance of nearby ships, NA if none
- average speed over ground of nearby ships, NA if none
- distance of the closest ship, NA if none
- speed over ground of closest ship, NA if none
Taiki Sakai [email protected]
gps <- data.frame(Latitude=c(33.2, 33.5, 33.6),
                  Longitude=c(-118.1, -118.4, -119),
                  UTC=as.POSIXct(c('2022-04-28 05:00:00',
                                   '2022-04-28 10:00:00',
                                   '2022-04-28 20:00:00'),
                                 tz='UTC'))
ais <- readLocalAIS(gps, system.file('extdata/ais', package='PAMscapes'))
aisSummary <- addAISSummary(gps, ais)
str(aisSummary)
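The kind of summary described above can be sketched in base R for a single location and time step. The haversine helper, the vessel positions, the SOG column, and the output column names below are all hypothetical and chosen only for illustration; the package may compute distances and name its columns differently.

```r
# Great-circle (haversine) distance in meters between points given in degrees
haversine <- function(lat1, lon1, lat2, lon2, r = 6371000) {
  toRad <- pi / 180
  dLat <- (lat2 - lat1) * toRad
  dLon <- (lon2 - lon1) * toRad
  a <- sin(dLat / 2)^2 +
    cos(lat1 * toRad) * cos(lat2 * toRad) * sin(dLon / 2)^2
  2 * r * asin(pmin(1, sqrt(a)))
}
# Hypothetical vessel positions at one time step (invented values)
vessels <- data.frame(
  Latitude = c(33.21, 33.25, 34.00),
  Longitude = c(-118.10, -118.12, -118.10),
  SOG = c(10, 14, 8)
)
recLat <- 33.2
recLon <- -118.1
dist <- haversine(recLat, recLon, vessels$Latitude, vessels$Longitude)
# Vessels within 10 km of the location
near <- dist <= 10000
# Summary values are NA when no vessel is nearby
aisSummaryRow <- data.frame(
  nNearby = sum(near),
  meanDistance = if (any(near)) mean(dist[near]) else NA_real_,
  meanSpeed = if (any(near)) mean(vessels$SOG[near]) else NA_real_,
  closestDistance = if (any(near)) min(dist[near]) else NA_real_,
  closestSpeed = if (any(near)) vessels$SOG[near][which.min(dist[near])] else NA_real_
)
str(aisSummaryRow)
```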
Transforms detection data to presence-type data with user specified time bin (e.g. hourly or daily presence).
binDetectionData( x, bin, columns = c("species", "project"), rematchGPS = TRUE, gpsGroup = NULL )
x |
dataframe of detection data |
bin |
the amount of time to bin by, must be a character of the form
|
columns |
names of the columns in |
rematchGPS |
logical flag, if |
gpsGroup |
the name of the column in |
a dataframe where each row represents detection presence of one time unit
Taiki Sakai [email protected]
dets <- data.frame(
    UTC = as.POSIXct(c('2020-04-04 12:20:00', '2020-04-04 12:40:00', '2020-04-04 13:20:00')),
    species = c('whale', 'whale', 'dolphin'),
    call = c('a', 'b', 'c'))
# two rows of outputs
binDetectionData(dets, bin='1hour', columns='species')
# adding "call" creates 3 rows of outputs
binDetectionData(dets, bin='1hour', columns=c('species', 'call'))
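The row counts in the example above follow from how presence-style binning works: each time is floored to the start of its bin, and only unique combinations of bin and grouping columns are kept. A minimal base R sketch of that idea, which is an illustration and not the package's implementation, is:

```r
dets <- data.frame(
  UTC = as.POSIXct(c('2020-04-04 12:20:00', '2020-04-04 12:40:00',
                     '2020-04-04 13:20:00'), tz = 'UTC'),
  species = c('whale', 'whale', 'dolphin'),
  call = c('a', 'b', 'c')
)
# Floor each time to the start of its hour
dets$UTC <- as.POSIXct(trunc(dets$UTC, units = 'hours'))
# One row per unique (hour, species) combination
bySpecies <- unique(dets[c('UTC', 'species')])
# Adding "call" keeps the two whale rows distinct
byCall <- unique(dets[c('UTC', 'species', 'call')])
nrow(bySpecies)
nrow(byCall)
```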
Bins soundscape measurements by a unit of time and summarises them using a function (usually the median)
binSoundscapeData( x, bin = "1hour", method = c("median", "mean"), binCount = FALSE, extraCols = NULL )
x |
a data.frame of soundscape metric data read in with loadSoundscapeData |
bin |
amount of time to bin data by, format can
be "#Unit" e.g. |
method |
summary function to apply to data in each time bin, must be one of "median" or "mean" |
binCount |
logical flag to return the number of times in each time bin as column "binCount" |
extraCols |
Additional non-frequency columns in |
a summarised version of the input data.frame x
Taiki Sakai [email protected]
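To make the binning step concrete, the sketch below floors times to the hour and takes the median of each metric column within each bin, also counting the number of measurements per bin. The input values and the OL_125 and OL_250 columns are invented for illustration, and the sketch uses only base R rather than the package's own code.

```r
# Hypothetical wide-format soundscape data (invented values)
x <- data.frame(
  UTC = as.POSIXct(c('2022-01-01 00:10:00', '2022-01-01 00:40:00',
                     '2022-01-01 00:50:00', '2022-01-01 01:15:00'), tz = 'UTC'),
  OL_125 = c(80, 84, 82, 90),
  OL_250 = c(70, 71, 75, 77)
)
# Floor each time to the start of its hour
x$UTC <- as.POSIXct(trunc(x$UTC, units = 'hours'))
# Median of each metric column within each hourly bin
binned <- aggregate(x[c('OL_125', 'OL_250')], by = list(UTC = x$UTC), FUN = median)
# Number of measurements contributing to each bin
counts <- aggregate(x['OL_125'], by = list(UTC = x$UTC), FUN = length)
binned$binCount <- counts$OL_125
binned
```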
Reads and checks data to ensure formatting will work for other PAMscapes functions. Will read and check the formatting of CSV files, or check the formatting of dataframes. Can also read in MANTA NetCDF files and format the data appropriately.
checkSoundscapeInput( x, needCols = c("UTC"), skipCheck = FALSE, timeBin = NULL, binFunction = median, octave = c("original", "tol", "ol"), label = NULL, tz = "UTC", extension = c("nc", "csv") )
x |
a dataframe, path to a CSV file, or path to a MANTA
NetCDF file, or folder containing these. If |
needCols |
names of columns that must be present in |
skipCheck |
logical flag to skip some data checking, recommended
to keep as |
timeBin |
amount of time to bin data by, format can
be "#Unit" e.g. |
binFunction |
summary function to apply to data in each time bin |
octave |
one of "original", "tol", or "ol". If "original" then nothing happens, otherwise data are converted to Octave-level ("ol") or Third-Octave-Level ("tol") measurements using createOctaveLevel |
label |
optional, if not |
tz |
timezone of the data being loaded, will be converted to UTC after load |
extension |
only used if |
Files created by MANTA and Triton software will be reformatted to have consistent formatting. The first column will be renamed to "UTC", and columns containing soundscape metrics will be named using the convention "TYPE_FREQUENCY", e.g. "HMD_1", "HMD_2" for Manta hybrid millidecade measurements.
Inputs from sources other than MANTA or Triton can be accepted in either "wide" or "long" format. Wide format must follow the conventions above - first column "UTC", other columns named by "TYPE_FREQUENCY" where TYPE is consistent across all columns and FREQUENCY is in Hertz. Long format data must have the following columns:
- time of the measurement, in UTC timezone
- the type of soundscape measurement e.g. PSD or OL, must be the same for all
- the frequency of the measurement, in Hertz
- the soundscape measurement value, usually dB
a dataframe
Taiki Sakai [email protected]
manta <- checkSoundscapeInput(system.file('extdata/MANTAExampleSmall1.csv', package='PAMscapes'))
str(manta)
ol <- checkSoundscapeInput(system.file('extdata/OLSmall.csv', package='PAMscapes'))
str(ol)
psd <- checkSoundscapeInput(system.file('extdata/PSDSmall.csv', package='PAMscapes'))
str(psd)
Creates octave or third octave level measurements from finer resolution soundscape metrics, like Power Spectral Density (PSD) or Hybrid Millidecade (HMD) measures
createOctaveLevel( x, type = c("ol", "tol"), freqRange = NULL, normalized = FALSE )
x |
dataframe of soundscape metrics |
type |
either |
freqRange |
a vector of the minimum and maximum center frequencies (Hz) desired
for the output. If |
normalized |
logical flag to return values normalized by the bandwidth of each octave level band |
Note that these measures are not as precise as they could be, mostly meant to be used for visualizations. Bands of the original data that do not fit entirely within a single octave band are not proportionately split between the two proper output bands. Instead an output band will contain all inputs where the center frequency falls between the limits of the output band. For higher frequencies this should result in negligible differences, but lower frequencies will be more imprecise.
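The aggregation rule described above, where an output band collects every input whose center frequency falls within its limits, can be sketched for a single octave band as follows. This assumes the inputs are band levels in dB that add in power, uses the common definition of octave band limits as the center frequency divided and multiplied by sqrt(2), and uses invented input values; createOctaveLevel itself may define band edges or handle bandwidth differently.

```r
# Hypothetical narrow-band levels in dB at 1 Hz spacing (invented values)
freq <- 80:200
level <- rep(60, length(freq))
# Octave band centered at 125 Hz spans fc / sqrt(2) to fc * sqrt(2)
fc <- 125
lower <- fc / sqrt(2)
upper <- fc * sqrt(2)
# Keep every input band whose center frequency falls inside the limits
inBand <- freq >= lower & freq < upper
# Sum power in linear units, then convert back to dB
bandLevel <- 10 * log10(sum(10^(level[inBand] / 10)))
bandLevel
```

Because inputs are assigned whole to one band, an input straddling a band edge contributes all of its energy to a single output band, which is the source of the imprecision at low frequencies noted above.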
a dataframe with summarised octave level band measurements
Taiki Sakai [email protected]
psd <- loadSoundscapeData(system.file('extdata/PSDSmall.csv', package='PAMscapes'))
str(psd)
tol <- createOctaveLevel(psd, type='tol')
str(tol)
ol <- createOctaveLevel(tol, type='ol')
str(ol)
Downloads daily AIS files from https://hub.marinecadastre.gov/pages/vesseltraffic covering the date range present in input data
downloadMarCadAIS(x, outDir, overwrite = FALSE, unzip = TRUE, verbose = TRUE)
x |
a dataframe with column |
outDir |
directory to save the downloaded files |
overwrite |
logical flag to overwrite existing data. Recommended
to be |
unzip |
logical flag to unzip downloaded files. Original downloads from Marine Cadastre come as large .zip |
verbose |
logical flag to print messages about download progress |
a vector of the paths to the downloaded .zip files, any days
that were unable to download will be NA
Taiki Sakai [email protected]
gps <- data.frame(Latitude=c(33.2, 33.5, 33.6),
                  Longitude=c(-118.1, -118.4, -119),
                  UTC=as.POSIXct(c('2022-04-28 05:00:00',
                                   '2022-04-28 10:00:00',
                                   '2022-04-28 20:00:00'),
                                 tz='UTC'))
tempDir <- tempdir()
# Commented out because running this will download
# a ~500mb file
# marcadFiles <- downloadMarCadAIS(gps, outDir=tempDir)
Format effort data for use in other acoustic detection plotting functions. Time ranges will be marked as either "on" or "off" effort
formatEffort( effort, range = NULL, resolution = NULL, columns = NULL, combineYears = FALSE )
effort |
dataframe with columns |
range |
if not |
resolution |
if not |
columns |
if not |
combineYears |
logical flag to combine all years into a single "year" |
a dataframe with columns start, end, and status which is either "on" or "off", as well as any columns listed in columns
Taiki Sakai [email protected]
Loads and formats detection data into a common format for use in other PAMscapes functions
loadDetectionData( x, source = c("csv", "makara"), columnMap = NULL, detectionType = c("auto", "presence", "detection"), presenceDuration = NULL, dateFormat = c("%Y-%m-%dT%H:%M:%S+0000", "%Y-%m-%d %H:%M:%S", "%m-%d-%Y %H:%M:%S", "%Y/%m/%d %H:%M:%S", "%m/%d/%Y %H:%M:%S"), tz = "UTC", wide = FALSE, speciesCols = NULL, detectedValues = NULL, extraCols = NULL, ... )
x |
dataframe or path to CSV file containing detection data |
source |
source of the detection data, choices other than "csv" just specify specific formatting options |
columnMap |
a list or data.frame specifying how to map the input
column names to the required standard names of "UTC", "end", and "species".
If a list, must be a named list where the names are the existing
column names and the values are the standardized names, e.g.
|
detectionType |
one of "auto", "presence", or "detection" specifying the type of detection in the data. "presence" means hourly or daily presence style of detections - the duration of the detection is used for the time unit (e.g. hourly presence might have "UTC" value 2020-01-01 12:00:00 and "end" value 2020-01-01 13:00:00 for a detection). "detection" means the data refer to specific detections or bouts of detections rather than just presence. "auto" means that the type of detection will be inferred from the start and end time of each detection - any detections with a duration of exactly one hour or exactly one day will be marked as "presence", any other duration will be marked as "detection" |
presenceDuration |
if |
dateFormat |
format string of dates, see strptime. Can be a vector of multiple formats |
tz |
time zone of input data |
wide |
logical flag indicating whether the input data has species
detection information in wide (instead of long) format. If |
speciesCols |
only used if |
detectedValues |
only used if |
extraCols |
(optional) any additional columns to keep with the output |
... |
additional arguments used for certain |
a dataframe with columns UTC, end, species, and detectionType, where each row represents a single detection event. May have additional columns depending on other parameters
Taiki Sakai [email protected]
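Two of the steps described in the arguments above, renaming input columns with a named list and inferring the detection type from the duration of each detection, can be sketched in base R. The input column names (start, stop, sp) and all values are hypothetical, and this is an illustration of the described behaviour rather than the code used by loadDetectionData.

```r
# Hypothetical raw detection data with non-standard column names
raw <- data.frame(
  start = as.POSIXct(c('2020-01-01 12:00:00', '2020-01-01 00:00:00',
                       '2020-01-01 12:03:10'), tz = 'UTC'),
  stop = as.POSIXct(c('2020-01-01 13:00:00', '2020-01-02 00:00:00',
                      '2020-01-01 12:04:00'), tz = 'UTC'),
  sp = c('whale', 'whale', 'dolphin')
)
# Names are the existing column names, values are the standardized names
columnMap <- list(start = 'UTC', stop = 'end', sp = 'species')
names(raw)[match(names(columnMap), names(raw))] <- unlist(columnMap)
# Durations of exactly one hour or one day are treated as "presence"
durSec <- as.numeric(difftime(raw$end, raw$UTC, units = 'secs'))
raw$detectionType <- ifelse(durSec %in% c(3600, 86400), 'presence', 'detection')
raw
```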
Reads in hybrid millidecade data from a MANTA NetCDF output file and formats it into the dataframe format required for use in other PAMscapes functions
loadMantaNc(x, keepQuals = c(1), keepEffort = TRUE)
x |
path to .nc file |
keepQuals |
quality flag values to keep. Accepts vector of
integers from (1, 2, 3, 4) corresponding to flag labels "Good",
"Not evaluated/Unknown", "Compromised/Questionable", and "Unusable/Bad".
HMD levels for points with data quality flags outside of |
keepEffort |
if |
a dataframe with first column UTC and other columns named HMD_Frequency
Taiki Sakai [email protected]
# no sample NetCDF provided (too large)
manta <- loadMantaNc('MANTA.nc')
Loads soundscape data just like loadSoundscapeData, but is designed to load multiple soundscape datasets from multiple folders. This is identical to loading each folder of data individually with the same bin and label parameters.
loadMultiscapeData( x, timeBin = NULL, binFunction = "median", binCount = FALSE, octave = c("original", "tol", "ol"), label = NULL, keepQuals = c(1), keepEffort = TRUE, dropNonHmd = TRUE, tz = "UTC", extension = c("nc", "csv") )
x |
a vector of folder names to load |
timeBin |
amount of time to bin data by, format can
be "#Unit" e.g. |
binFunction |
summary function to apply to data in each time bin, default is "median" |
binCount |
logical flag to return the number of times in each time bin as column "binCount" |
octave |
one of "original", "tol", or "ol". If "original" then nothing happens, otherwise data are converted to Octave-level ("ol") or Third-Octave-Level ("tol") measurements using createOctaveLevel |
label |
if not |
keepQuals |
quality flag values to keep. Accepts vector of
integers from (1, 2, 3, 4) corresponding to flag labels "Good",
"Not evaluated/Unknown", "Compromised/Questionable", and "Unusable/Bad".
HMD levels for points with data quality flags outside of |
keepEffort |
if |
dropNonHmd |
logical flag to drop non-standard hybrid millidecade
bands, only applies to HMD type data. Some datasets have frequency
values that are not part of the standard HMD bands (e.g. at exactly
the Nyquist rate), if |
tz |
timezone of the data being loaded, will be converted to UTC after load |
extension |
only required if both netCDF and CSV files exist in the folders to load, in which case only one type will be loaded. Must be one of "nc" or "csv" |
This function is equivalent to loading each folder of data separately with the same time and octave-level aggregation options applied, and is meant as a convenient wrapper for loading multiple years or sites of data for comparison. The expectation is that this function will be primarily used for large scale comparisons, which is why timeBin is a required argument to reduce data resolution.
The only other difference is that if no labels are supplied for the folders, then one will be generated either from the names of x if it is a named vector, or the name of the folder using basename. This is to ensure that each separate folder can be identified once read in.
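The label fallback described above can be sketched with a small helper. The function name makeLabels and the folder paths are hypothetical and only illustrate the rule of preferring the names of x and otherwise using basename.

```r
# Hypothetical helper: use names of x if present, otherwise the folder name
makeLabels <- function(x) {
  if (!is.null(names(x)) && all(nzchar(names(x)))) {
    return(names(x))
  }
  basename(x)
}
labelsNamed <- makeLabels(c(siteA = 'data/2021/north', siteB = 'data/2021/south'))
labelsFolder <- makeLabels(c('data/2021/north', 'data/2021/south'))
labelsNamed
labelsFolder
```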
a dataframe
Taiki Sakai [email protected]
Reads and checks data to ensure formatting will work for other PAMscapes functions. Will read and check the formatting of CSV files, or check the formatting of dataframes. Can also read in MANTA NetCDF files and format the data appropriately.
loadSoundscapeData( x, needCols = c("UTC"), skipCheck = FALSE, timeBin = NULL, binFunction = "median", binCount = FALSE, octave = c("original", "tol", "ol"), label = NULL, keepQuals = c(1), keepEffort = TRUE, dropNonHmd = TRUE, tz = "UTC", extension = c("nc", "csv") )
x |
a dataframe, path to a CSV file, or path to a MANTA
NetCDF file, or folder containing these. If |
needCols |
names of columns that must be present in |
skipCheck |
logical flag to skip some data checking, recommended
to keep as |
timeBin |
amount of time to bin data by, format can
be "#Unit" e.g. |
binFunction |
summary function to apply to data in each time bin, default is "median" |
binCount |
logical flag to return the number of times in each time bin as column "binCount" |
octave |
one of "original", "tol", or "ol". If "original" then nothing happens, otherwise data are converted to Octave-level ("ol") or Third-Octave-Level ("tol") measurements using createOctaveLevel |
label |
optional, if not |
keepQuals |
quality flag values to keep. Accepts vector of
integers from (1, 2, 3, 4) corresponding to flag labels "Good",
"Not evaluated/Unknown", "Compromised/Questionable", and "Unusable/Bad".
HMD levels for points with data quality flags outside of |
keepEffort |
if |
dropNonHmd |
logical flag to drop non-standard hybrid millidecade
bands, only applies to HMD type data. Some datasets have frequency
values that are not part of the standard HMD bands (e.g. at exactly
the Nyquist rate), if |
tz |
timezone of the data being loaded, will be converted to UTC after load |
extension |
only used if |
Files created by MANTA and Triton software will be reformatted to have consistent formatting. The first column will be renamed to "UTC", and columns containing soundscape metrics will be named using the convention "TYPE_FREQUENCY", e.g. "HMD_1", "HMD_2" for Manta hybrid millidecade measurements.
Inputs from sources other than MANTA or Triton can be accepted in either "wide" or "long" format. Wide format must follow the conventions above - first column "UTC", other columns named by "TYPE_FREQUENCY" where TYPE is consistent across all columns and FREQUENCY is in Hertz. Long format data must have the following columns:
- time of the measurement, in UTC timezone
- the type of soundscape measurement e.g. PSD or OL, must be the same for all
- the frequency of the measurement, in Hertz
- the soundscape measurement value, usually dB
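The relationship between the wide and long layouts can be sketched by reshaping a small wide dataframe in base R. The long-format column names used here (UTC, type, frequency, value) are an assumption for illustration, as are the data values; check the package documentation for the exact names it requires.

```r
# Hypothetical wide-format data following the "TYPE_FREQUENCY" convention
wide <- data.frame(
  UTC = as.POSIXct(c('2022-01-01 00:00:00', '2022-01-01 00:01:00'), tz = 'UTC'),
  OL_125 = c(80, 81),
  OL_250 = c(70, 72)
)
metricCols <- setdiff(names(wide), 'UTC')
# Split each column name into its type and frequency, then stack the columns
long <- do.call(rbind, lapply(metricCols, function(col) {
  parts <- strsplit(col, '_', fixed = TRUE)[[1]]
  data.frame(UTC = wide$UTC,
             type = parts[1],
             frequency = as.numeric(parts[2]),
             value = wide[[col]])
}))
long
```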
a dataframe
Taiki Sakai [email protected]
manta <- loadSoundscapeData(system.file('extdata/MANTAExampleSmall1.csv', package='PAMscapes'))
str(manta)
ol <- loadSoundscapeData(system.file('extdata/OLSmall.csv', package='PAMscapes'))
str(ol)
psd <- loadSoundscapeData(system.file('extdata/PSDSmall.csv', package='PAMscapes'))
str(psd)
Marks values within a soundscape dataframe as NA according to provided time and (optionally) frequency values
markNA(x, na, by = NULL)
x |
dataframe of soundscape data to mark NAs in |
na |
dataframe listing areas to mark NA. Must have columns |
by |
optional column name in both |
same dataframe as x but with some values replaced with NA
Taiki Sakai [email protected]
manta <- loadSoundscapeData(system.file('extdata/MANTAExampleSmall1.csv', package='PAMscapes'))
naDf <- data.frame(start=min(manta$UTC),
                   end=max(manta$UTC),
                   freqMin=100,
                   freqMax=500)
plotHourlyLevel(manta)
plotHourlyLevel(markNA(manta, na=naDf))
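The marking step in the example above can be sketched on a small wide-format dataframe in base R. The data values are invented, the treatment of range boundaries as inclusive is an assumption, and the loop is an illustration rather than the code used by markNA.

```r
# Hypothetical wide-format soundscape data (invented values)
x <- data.frame(
  UTC = as.POSIXct(c('2022-01-01 00:00:00', '2022-01-01 01:00:00',
                     '2022-01-01 02:00:00'), tz = 'UTC'),
  OL_125 = c(80, 81, 82),
  OL_250 = c(70, 71, 72),
  OL_1000 = c(60, 61, 62)
)
# Time and frequency range to mark as NA
naDf <- data.frame(
  start = as.POSIXct('2022-01-01 00:30:00', tz = 'UTC'),
  end = as.POSIXct('2022-01-01 01:30:00', tz = 'UTC'),
  freqMin = 100, freqMax = 500
)
metricCols <- setdiff(names(x), 'UTC')
# Frequency is the part of each column name after the underscore
freqs <- as.numeric(sub('^[^_]+_', '', metricCols))
marked <- x
for (i in seq_len(nrow(naDf))) {
  rows <- x$UTC >= naDf$start[i] & x$UTC <= naDf$end[i]
  cols <- metricCols[freqs >= naDf$freqMin[i] & freqs <= naDf$freqMax[i]]
  marked[rows, cols] <- NA
}
marked
```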
Downloads and matches wind and precipitation data from the Global Forecast System (GFS) weather model. Data is downloaded from the National Center for Atmospheric Research data server https://rda.ucar.edu/datasets/ds084.1/. The particular GFS dataset downloaded is the closest "forecast" dataset to the particular time (e.g. .f000 or .f003)
matchGFS(x, progress = TRUE, keepMatch = TRUE)
x |
a dataframe with columns |
progress |
logical flag to display download progress |
keepMatch |
logical flag to keep the "matchLat", "matchLong", and "matchTime" columns with the output. These are only used to verify which coordinates within the NetCDF were matched to your data. |
a dataframe with wind (m/s) and precipitation rate (kg/m^2/s) columns added:
- Eastward wind velocity
- Northward wind velocity
- Total wind magnitude
- Precipitation rate
- Closest latitude coordinate matched in GFS
- Closest longitude coordinate matched in GFS
- Closest time coordinate matched in GFS
The last three columns are only included if keepMatch=TRUE
Taiki Sakai [email protected]
# API response may be slow for this example
gps <- data.frame(Latitude=c(33.2, 33.5, 33.6),
                  Longitude=c(-118.1, -118.4, -119),
                  UTC=as.POSIXct(c('2022-04-28 05:00:00',
                                   '2022-04-28 10:00:00',
                                   '2022-04-28 20:00:00'),
                                 tz='UTC'))
gps <- matchGFS(gps)
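For reference, a total wind magnitude is conventionally the Euclidean norm of the eastward and northward components. The short sketch below shows that calculation with invented values and hypothetical variable names; it is assumed, not confirmed, that the package computes its magnitude column this way.

```r
windU <- c(3, -4, 0)   # eastward wind velocity (m/s), invented values
windV <- c(4, 3, -5)   # northward wind velocity (m/s), invented values
# Magnitude is the length of the (U, V) vector
windMag <- sqrt(windU^2 + windV^2)
windMag
```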
Downloads and matches relevant Seascape class data from the ERDDAP (Environmental Research Division's Data Access Program) server at https://cwcgom.aoml.noaa.gov/erddap/index.html. More information on the classes can be found on the help page for the seascapeR package https://marinebon.github.io/seascapeR/index.html.
matchSeascape(x, type = c("monthly", "8day"), progress = TRUE)
x |
a dataframe with columns |
type |
the type of seascape data to download, one of "monthly" or "8day" |
progress |
logical flag whether or not to show download progress |
This function is just a wrapper around matchEnvData pointing to the specific base URL and dataset ID relevant for seascape data
the same dataframe as x, but with new columns seascapeClass and seascapeProb representing the "CLASS" and "P" variables from the dataset
Taiki Sakai [email protected]
Plots a representation of the acoustic scene using detections in data. Frequency ranges for detections are taken from user input and displayed as different colored bars
plotAcousticScene( x, freqMap = NULL, typeCol = "species", title = NULL, bin = "1day", by = NULL, combineYears = FALSE, effort = NULL, scale = c("log", "linear"), freqMin = NULL, freqMax = NULL, fill = TRUE, alpha = 1, returnData = FALSE, add = FALSE )
x |
dataframe of detections, must have column |
freqMap |
a dataframe listing frequency ranges to use for
various detection types in |
typeCol |
column name in |
title |
optional title to use for the plot |
bin |
time bin to use for plotting time axis. Each detection will be displayed as covering this amount of time |
by |
if not |
combineYears |
logical flag to combine all observations to display as a single "year". The year will be set to 2019, and detections falling on leap days (February 29th) will be removed |
effort |
if not |
scale |
one of |
freqMin |
optional minimum frequency for plot, useful for log scale |
freqMax |
optional maximum frequency for plot |
fill |
logical flag if |
alpha |
transparency percentage for plotting, values less than 1 will allow multiple overlapping colors to be seen |
returnData |
if |
add |
logical flag if |
a ggplot object
Taiki Sakai [email protected]
detDf <- data.frame(
   UTC=as.POSIXct(c('2023-01-01 00:00:00',
                    '2023-01-03 00:00:00',
                    '2023-01-02 12:00:00',
                    '2023-01-04 00:00:00'),
                  tz='UTC'),
   species = c('Dolphin', 'Whale', 'Whale', 'Dolphin'))
freqMap <- data.frame(type=c('Dolphin', 'Whale'),
                      freqMin=c(10e3, 100),
                      freqMax=c(30e3, 400),
                      color=c('darkgreen', 'blue'))
plotAcousticScene(detDf, freqMap=freqMap, typeCol='species', bin='1day')
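The combineYears behaviour described in the arguments above (setting the year to 2019 and removing leap days) can be sketched in base R. The times are invented, and this illustrates the described transformation rather than the package's internal code.

```r
# Hypothetical detection times spanning several years
utc <- as.POSIXct(c('2020-02-29 06:00:00', '2021-07-04 12:00:00',
                    '2022-12-31 23:00:00'), tz = 'UTC')
# Drop detections falling on February 29th
isLeapDay <- format(utc, '%m-%d') == '02-29'
utc <- utc[!isLeapDay]
# Replace the year with 2019, keeping month, day, and time
combined <- as.POSIXct(paste0('2019-', format(utc, '%m-%d %H:%M:%S')), tz = 'UTC')
combined
```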
Plots time series of boxplots showing detection data across time
plotDetectionBoxplot( x, group = "species", facet = NULL, color = hue_pal(), bin = "day/week", combineYears = FALSE, effort = NULL, dropZeroes = FALSE, returnData = FALSE )
x |
dataframe of detection data read in with loadDetectionData |
group |
name(s) of columns indicating which rows of |
facet |
if not |
color |
only used if |
bin |
time bins to use for generating plot, must be a character of
format "time1/time2" where "time1" will be the y-axis of the plot and
"time2" will be the x-axis of the plot. Times are one of "hour", "day",
"week", or "month" (e.g. |
combineYears |
logical flag to combine all observations to display as a single "year" |
effort |
if not |
dropZeroes |
logical flag to remove boxplots where all observations are zero (these would normally appear as a flat line at zero) |
returnData |
if |
The combination of group, facet, and combineYears determines the data points that make up each boxplot. If combineYears=TRUE, then there will be a different point for each year. There will additionally be separate points for each different value of the columns in group, excluding the column used for facet (since these points are instead split out to different faceted plots).
For example, if you have data from a single location, then settings of combineYears=FALSE, group='species', and facet=NULL will create a plot where each point in a boxplot represents the number of detections for a species. If you change to facet='species', then the result will show a multi panel plot where each boxplot is just a single point. Then changing to combineYears=TRUE will show a multi panel plot where each point in a boxplot is the number of detections for that panel's species in different years.
a ggplot object
Taiki Sakai [email protected]
Plots a heatmap of summarised sound levels. Y-axis is hour of the day, X-axis is frequency bin. Plotted values are the median of the value column for each hour/frequency pairing across the dataset. This function is designed to work with sound level outputs with consistent frequency bins measured across time
plotHourlyLevel( x, title = NULL, units = NULL, scale = c("log", "linear"), freqMin = NULL, dbRange = NULL, toTz = "UTC", cmap = viridis_pal()(25), returnData = FALSE )
x |
a dataframe with columns |
title |
title for the plot. If |
units |
name of units for plot labeling, default is taken from common soundscape units |
scale |
one of |
freqMin |
minimum frequency for the plot range, if desired to be different than the minimum frequency of the data |
dbRange |
range of dB values to plot |
toTz |
timezone to use for the time axis (input data must be UTC). Specification must be from OlsonNames |
cmap |
color palette map to use for plot, default is viridis_pal |
returnData |
if |
a ggplot object
Taiki Sakai [email protected]
plotHourlyLevel(system.file('extdata/OLSmall.csv', package='PAMscapes'))
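The summary behind the heatmap, a median per hour of day for each frequency, can be sketched for a single frequency column in base R. The data values, the OL_125 column, and the choice of 'America/Los_Angeles' as the display timezone are invented for illustration; this is not the plotting function's own code.

```r
# Hypothetical measurements at one frequency across several days (UTC)
x <- data.frame(
  UTC = as.POSIXct(c('2022-01-01 08:10:00', '2022-01-02 08:20:00',
                     '2022-01-03 08:30:00', '2022-01-01 20:00:00'), tz = 'UTC'),
  OL_125 = c(80, 90, 85, 70)
)
# Hour of day in the display timezone
x$hour <- as.numeric(format(x$UTC, '%H', tz = 'America/Los_Angeles'))
# Median level for each hour of the day
hourly <- aggregate(OL_125 ~ hour, data = x, FUN = median)
hourly
```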
Creates a long-term spectral average (LTSA) style plot of the data, a plot where the x-axis is time and the y-axis is frequency. Color represents the magnitude of sound. In order to compress the time axis, data are binned into time chunks and the median value within that time bin is displayed
plotLTSA( x, bin = "1hour", scale = c("log", "linear"), title = NULL, freqRange = NULL, dbRange = NULL, units = NULL, facet = NULL, cmap = viridis_pal()(25), toTz = "UTC", alpha = 1, maxBins = 800, returnData = FALSE )
x |
a soundscape metric file that can be read in with
loadSoundscapeData, or a dataframe with |
bin |
amount of time to bin for each LTSA slice, format can
be "#Unit" e.g. |
scale |
scaling for frequency axis, one of |
title |
optional title for plot |
freqRange |
if not |
dbRange |
if not |
units |
units for plot labeling, will attempt to read them from the input |
facet |
optional column to facet by to create multiple LTSA plots in separate rows |
cmap |
color palette map to use for plot, default is viridis_pal |
toTz |
timezone to use for the time axis (input data must be UTC). Specification must be from OlsonNames |
alpha |
alpha to use for the plot fill |
maxBins |
the maximum number of time bins to create for the plot. If
|
returnData |
if |
ggplot object of the LTSA plot
Taiki Sakai [email protected]
hmd <- loadSoundscapeData(system.file('extdata/MANTAExampleSmall1.csv', package='PAMscapes'))
# time range is too small for nice plots
plotLTSA(hmd, bin='1min', title='Every Minute')
plotLTSA(hmd, bin='2min', title='2 Minute Bins')
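The time-binning idea behind an LTSA (group measurements into time chunks, then take the median per chunk and frequency) can be sketched in base R. This is a rough illustration under assumed column names (`UTC`, `frequency`, `value`) and a fixed one-hour bin in seconds; it is not how plotLTSA itself parses the bin argument.

```r
# Hypothetical data: 1-minute measurements at two frequencies
times <- seq(as.POSIXct('2022-04-28 00:00:00', tz='UTC'),
             by='1 min', length.out=120)
long <- data.frame(
    UTC = rep(times, times=2),
    frequency = rep(c(100, 200), each=length(times)),
    value = c(1:120, 201:320)
)

# Floor each timestamp to the start of its 1-hour bin
binSec <- 3600
binStart <- floor(as.numeric(long$UTC) / binSec) * binSec
long$timeBin <- as.POSIXct(binStart, origin='1970-01-01', tz='UTC')

# Median level within each time bin / frequency pairing
ltsa <- aggregate(value ~ timeBin + frequency, data=long, FUN=median)
ltsa
```

Larger bins compress the time axis further at the cost of temporal detail.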
Plots the distribution of summarised sound levels across frequency, either as lines of quantile levels or a heatmap showing the full distribution. Multiple PSD sources can be combined and plotted as long as they have identical frequency levels.
plotPSD( x, style = c("quantile", "density"), scale = c("log", "linear"), q = 0.5, color = "black", freqRange = NULL, dbRange = NULL, dbInt = 1, densityRange = NULL, units = "dB re: 1uPa^2/Hz", cmap = viridis_pal()(25), by = NULL, referenceLevel = NULL, facet = NULL, ncol = NULL, title = NULL, returnData = FALSE, progress = TRUE )
prepPSDData( x, freqRange = NULL, style = c("density", "quantile"), by = NULL, dbInt = 1, compression = 10000, progress = TRUE )
x |
a dataframe or list of dataframes, or file path or vector
of file paths, or the output from |
style |
character specifying plot style to create, either "quantile", "density", or a vector with both |
scale |
scale to use for frequency axis, one of "log" or "linear" |
q |
quantile to plot |
color |
color for quantile |
freqRange |
range of frequencies to plot |
dbRange |
range of dB values to plot |
dbInt |
bin interval size for density plot |
densityRange |
optional range of values for density color scale |
units |
units for dB axis of plot |
cmap |
color map to use for density plot |
by |
optional column to plot different quantile lines by, only affects
|
referenceLevel |
only used together with |
facet |
optional column to facet the plots by |
ncol |
number of columns to use when plotting with |
title |
optional title for plot |
returnData |
if |
progress |
logical flag to show progress bar |
compression |
compression factor for tdigest, lower
values are less accurate but will compute faster. Only relevant for
|
prepPSDData is called by the plotting code and does not necessarily need to be called separately from plotPSD. Loading PSD data can be time consuming, so it may be useful to load the data first, which makes it easier to spend time adjusting plot settings.
The output of prepPSDData is a list with 5 elements:
- the frequency values of the input data
- the value of the "freqRange" parameter if it was supplied
- the dB values of breakpoints used for "density" plotting
- the data used for quantile plots. These are stored as "tdigest" objects serialized using as.list.tdigest, from which quantiles can be computed
- the data used for density plots. These are stored as a matrix of bin counts: each column corresponds to the "frequency" output, each row corresponds to bins defined using "dbVals" as boundaries
a ggplot object for plotPSD, see details for prepPSDData
Taiki Sakai [email protected]
psd <- loadSoundscapeData(system.file('extdata/PSDSmall.csv', package='PAMscapes'))
# Plotting only first 1000 columns for brevity
plotPSD(psd[1:1000], style='density')
plotPSD(psd[1:1000], style='quantile', q=.05)
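The two plot styles summarise the same data in different ways: quantile lines per frequency, or counts of levels falling into dB bins per frequency. A minimal base R sketch of both summaries follows. It uses exact quantiles from `quantile()` rather than the tdigest approximation mentioned above, and the matrix `psdMat` is simulated purely for illustration.

```r
set.seed(42)
# Hypothetical PSD levels: rows are time steps, columns are frequency bins
psdMat <- matrix(
    rnorm(500 * 3, mean=rep(c(90, 80, 70), each=500), sd=4),
    ncol=3, dimnames=list(NULL, c('100', '200', '400'))
)

# "quantile" style: selected quantile levels for each frequency
qLevels <- apply(psdMat, 2, quantile, probs=c(0.1, 0.5, 0.9))

# "density" style: count levels in dB bins of width dbInt per frequency
dbInt <- 1
dbVals <- seq(floor(min(psdMat)), ceiling(max(psdMat)), by=dbInt)
counts <- apply(psdMat, 2, function(col) {
    hist(col, breaks=dbVals, plot=FALSE)$counts
})
# Convert counts to proportions within each frequency
density <- sweep(counts, 2, colSums(counts), '/')

qLevels
dim(counts)
```

Here each column of `counts` is one frequency and each row is one dB bin bounded by consecutive values of `dbVals`, mirroring the layout described for the density data.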
Plot timeseries of different values, rescaled so that multiple types of data are visible on the same plot
plotScaledTimeseries( x, columns, title = NULL, units = NULL, color = hue_pal(), cpal, lwd = 0.5, minVals = NA, relMax = 1, toTz = "UTC" )
x |
a dataframe with column |
columns |
the names of the columns to plot. Values of columns will be rescaled to appear similar to range of the first column |
title |
title for the plot |
units |
name of units for plot labeling, default is taken from common soundscape units |
color |
colors to use for different lines, can either be a color palette function or a vector of color names |
cpal |
Deprecated in favor of |
lwd |
line width, either a single value or a vector of widths
matching the length of |
minVals |
minimum value for each of |
relMax |
the maximum height of all rescaled columns relative to the first column, given as a proportion (1 is 100%). See Details for more info |
toTz |
timezone to use for the time axis (input data must be UTC). Specification must be from OlsonNames |
The data in the different columns of x may have very different ranges, so they must be rescaled in order to create a useful comparison plot. The default behavior is to rescale all columns to have the same min/max range as the first column in columns. This means that the Y-axis values will only be accurate for the first column, and all lines will have their minimum value at the bottom edge of the plot and their maximum value at the top edge of the plot.
There are some cases where this full-range rescaling is not desirable. One case is when one of the variables should have a minimum value of zero, but the lowest value present in your data is larger than zero. For example, wind speed in your data might range from 0.5 to 3, so by default this 0.5 value would appear at the bottom of the plot. However, it would make much more sense if the values were plotted relative to a minimum of zero. The minVals argument lets you control this. The default NA value uses the minimum of your data range, but you can provide a value of zero (or anything else) to control the displayed minimum.
It can also be distracting or busy to display all lines at the same relative height, especially as the number of columns displayed grows. There are two ways to help with this. First, the lwd parameter can be used to display certain lines more prominently, making it easier to keep track of more important information. Second, the relMax parameter can be used to control the maximum relative height of each line plot. The default value of 1 makes each line the same maximum height as the first column. Reducing this to a value of 0.75 would mean that all lines other than the first go no higher than 75% of the Y-axis.
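The rescaling described above can be sketched with a small helper. The function `rescaleToRef` and its arguments are hypothetical names chosen for this illustration and are not part of the package; the sketch only mirrors the described roles of minVals and relMax for a single column.

```r
# Rescale y onto the range of a reference column.
# minVal: value of y placed at the bottom of the plot (NA = min of y)
# relMax: proportion of the reference range reached by the max of y
rescaleToRef <- function(y, ref, minVal=NA, relMax=1) {
    if(is.na(minVal)) {
        minVal <- min(y, na.rm=TRUE)
    }
    refMin <- min(ref, na.rm=TRUE)
    refMax <- max(ref, na.rm=TRUE)
    scaled <- (y - minVal) / (max(y, na.rm=TRUE) - minVal)
    scaled * (refMax - refMin) * relMax + refMin
}

soundLevel <- c(80, 85, 90, 100)
windSpeed <- c(0.5, 1, 2, 3)

# Default: wind speed spans the full range of the sound level column
full <- rescaleToRef(windSpeed, soundLevel)
# Plot wind speed relative to a minimum of zero instead of 0.5
fromZero <- rescaleToRef(windSpeed, soundLevel, minVal=0)
# Additionally cap the rescaled line at 75% of the reference range
capped <- rescaleToRef(windSpeed, soundLevel, minVal=0, relMax=0.75)
range(full); range(fromZero); range(capped)
```

With minVal=0 the lowest wind speed no longer sits on the bottom edge of the plot, and with relMax=0.75 the highest rescaled value stops three quarters of the way up the reference range.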
a ggplot object
Taiki Sakai [email protected]
manta <- loadSoundscapeData(system.file('extdata/MANTAExampleSmall1.csv', package='PAMscapes'))
plotScaledTimeseries(manta, columns=c('HMD_50', 'HMD_100', 'HMD_200'))
Plot simple timeseries of values
plotTimeseries( x, bin = "1hour", column, title = NULL, units = NULL, style = c("line", "heatmap"), q = 0, by = NULL, cmap = viridis_pal()(25), toTz = "UTC" )
x |
a dataframe with column |
bin |
time bin for summarising data. The median of values within the same time bin will be plotted |
column |
the name of the column to plot |
title |
title for the plot, if left as default |
units |
name of units for plot labeling, default is taken from common soundscape units |
style |
one of |
q |
only valid for |
by |
only valid for |
cmap |
only valid for |
toTz |
timezone to use for the time axis (input data must be UTC). Specification must be from OlsonNames |
a ggplot object
Taiki Sakai [email protected]
manta <- loadSoundscapeData(system.file('extdata/MANTAExampleSmall1.csv', package='PAMscapes'))
plotTimeseries(manta, bin='1minute', column='HMD_150')
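The toTz argument expects a name from OlsonNames, with the input times in UTC. A short base R sketch of checking a timezone name and viewing a UTC timestamp in that zone is shown below. The zone 'America/Los_Angeles' is just an example value, and this is not the package's internal conversion code.

```r
# Example timezone name; must be one of the names listed by OlsonNames()
toTz <- 'America/Los_Angeles'
isValidTz <- toTz %in% OlsonNames()

# A UTC timestamp as it would appear in the input data
utcTime <- as.POSIXct('2022-04-28 20:00:00', tz='UTC')

# Changing the tzone attribute changes only how the time is displayed,
# the underlying instant stays the same
localTime <- utcTime
attr(localTime, 'tzone') <- toTz

format(utcTime, '%Y-%m-%d %H:%M:%S %Z')
format(localTime, '%Y-%m-%d %H:%M:%S %Z')
```

Because only the display timezone changes, any time binning applied afterwards groups the same instants, just labelled in local time.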
Reads in AIS data downloaded from Marine Cadastre of ship tracks that come within a certain distance of a given GPS track. Also calculates the distance to the GPS track for each AIS point
readLocalAIS(gps, aisDir, distance = 10000, timeBuff = 0)
gps |
a dataframe with columns |
aisDir |
directory of AIS CSV files to read from |
distance |
distance in meters around the GPS track to read AIS data for |
timeBuff |
extra time (seconds) before and after the GPS points to read AIS data for. This can help create a better picture of ship activity surrounding the GPS |
a dataframe of AIS data, with additional columns related to distance to provided buoy GPS track
Taiki Sakai [email protected]
gps <- data.frame(Latitude=c(33.2, 33.5, 33.6),
                  Longitude=c(-118.1, -118.4, -119),
                  UTC=as.POSIXct(c('2022-04-28 05:00:00', '2022-04-28 10:00:00', '2022-04-28 20:00:00'), tz='UTC'))
ais <- readLocalAIS(gps, aisDir=system.file('extdata/ais', package='PAMscapes'), distance=20e3)
str(ais)
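To illustrate the idea of keeping only AIS points within a given distance of a GPS track, here is a simplified base R sketch using the haversine great-circle formula. It measures distance to the nearest GPS point only and ignores time, so it is a simplification rather than the method readLocalAIS uses. The helper `haversineMeters`, the `aisPts` data and its `LAT`/`LON` column names are assumptions for the example.

```r
# Great-circle distance in meters between points given in decimal degrees
haversineMeters <- function(lat1, lon1, lat2, lon2, radius=6371000) {
    toRad <- pi / 180
    dLat <- (lat2 - lat1) * toRad
    dLon <- (lon2 - lon1) * toRad
    a <- sin(dLat / 2)^2 +
        cos(lat1 * toRad) * cos(lat2 * toRad) * sin(dLon / 2)^2
    2 * radius * asin(pmin(1, sqrt(a)))
}

gpsPts <- data.frame(Latitude=c(33.2, 33.5, 33.6),
                     Longitude=c(-118.1, -118.4, -119))
# Hypothetical AIS positions
aisPts <- data.frame(LAT=c(33.25, 34.5), LON=c(-118.15, -120.5))

# Distance from each AIS point to the nearest GPS point
aisPts$minDist <- sapply(seq_len(nrow(aisPts)), function(i) {
    min(haversineMeters(aisPts$LAT[i], aisPts$LON[i],
                        gpsPts$Latitude, gpsPts$Longitude))
})

# Keep only AIS points within 20 km of the track points
nearby <- aisPts[aisPts$minDist <= 20e3, ]
nearby
```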
Launches a shiny app that allows users to browse the various plotting functions available to visualize soundscape data
runSoundscapeExplorer(data = NULL)
data |
file path to soundscape data or data that has been loaded with loadSoundscapeData |
invisible TRUE
Taiki Sakai [email protected]
if(interactive()) {
    hmd <- loadSoundscapeData(system.file('extdata/MANTAExampleSmall1.csv', package='PAMscapes'))
    runSoundscapeExplorer(hmd)
}
Subsets the full download files from Marine Cadastre to a smaller region so that they are easier to work with
subsetMarCadAIS( inDir, outDir, latRange = c(20, 50), lonRange = c(-140, -110), name = "West_", overwrite = FALSE, progress = TRUE )
inDir |
directory containing Marine Cadastre AIS CSV files to subset |
outDir |
directory to write subsetted files to |
latRange |
range of desired latitudes (decimal degrees) |
lonRange |
range of desired longitudes (decimal degrees) |
name |
prefix to add to new filenames |
overwrite |
logical flag to overwrite existing files |
progress |
logical flag to show progress bar |
invisibly return new file names
Taiki Sakai [email protected]
outDir <- tempdir()
localFiles <- subsetMarCadAIS('AISData', outDir=outDir, latRange=c(20, 50), lonRange=c(-140, -110), name='West_')
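The regional subsetting step can be pictured as a simple bounding-box filter followed by writing a prefixed file. The sketch below works on a small made-up data frame. The `MMSI`, `LAT` and `LON` column names and the output file name are assumptions for the illustration, and this is not the package's own code.

```r
# Hypothetical AIS records
aisAll <- data.frame(
    MMSI = c(1, 2, 3, 4),
    LAT = c(33.1, 55.0, 25.0, 40.0),
    LON = c(-118.2, -130.0, -80.0, -125.0)
)
latRange <- c(20, 50)
lonRange <- c(-140, -110)

# Keep rows whose position falls inside the latitude/longitude box
inRegion <- aisAll$LAT >= latRange[1] & aisAll$LAT <= latRange[2] &
    aisAll$LON >= lonRange[1] & aisAll$LON <= lonRange[2]
west <- aisAll[inRegion, ]

# Write the subset with a name prefix, here into a temporary directory
outFile <- file.path(tempdir(), paste0('West_', 'AIS_example.csv'))
write.csv(west, outFile, row.names=FALSE)
west
```

Writing the smaller file once means later steps only need to read the region of interest.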