Introduction
This vignette shows the typical workflow to standardize a camera trap file from Snapshot Safari.
Set up a logger
This is an optional but recommended step. If you want to not only print messages to the console but also save them to a file, you can use a logger.
The function create_logger creates a file at the specified location and sets up the logging:
logfile <- file.path(tempdir(), "log", "logger.log")
logfile
#> [1] "/tmp/Rtmp2TFAKG/log/logger.log"
logger <- create_logger(my_logfile = logfile,
console = FALSE)
#> Create logger /tmp/Rtmp2TFAKG/log/logger.log
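The call above sets console = FALSE, so messages go only to the log file. If you also want them echoed to the console, a minimal sketch (assuming the console argument simply toggles console printing, as its name suggests; logfile2 is a hypothetical second log file):

```r
# Hedged sketch: assumes console = TRUE makes create_logger echo
# messages to the console in addition to writing them to the file.
logfile2 <- file.path(tempdir(), "log", "logger_verbose.log")
logger_verbose <- create_logger(my_logfile = logfile2,
                                console = TRUE)
```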
Read the file
First, we need to read the data. Here, we read a file that was previously written in /tmp/Rtmp2TFAKG/data_in (not shown). This dataframe represents Digikam-like data:
in_folder <- file.path(tempdir(), "data_in")
in_folder
#> [1] "/tmp/Rtmp2TFAKG/data_in"
df <- read.csv(file.path(in_folder, "digikam.csv"))
head(df, 3)
#> X Station Species DateTimeOriginal Date Time delta.time.secs
#> 1 1 G03 porcupine 2018-06-28 8:56 2018-06-28 17:38:42 0
#> 2 2 D06 kudu 2018-06-25 16:13 2018-06-25 7:18:05 0
#> 3 3 E06 springbok 2018-06-29 18:33 2018-06-29 0:53:56 353978
#> delta.time.mins delta.time.hours delta.time.days Directory
#> 1 0.0 0.0 0.0 E:/MOK/MOK_Roll1/G03
#> 2 0.0 0.0 0.0 E:/MOK/MOK_Roll1/D06
#> 3 5899.6 98.3 4.1 E:/MOK/MOK_Roll1/E06
#> FileName EXIF.Model EXIF.Make metadata_Species metadata_Number
#> 1 I_00006a.JPG E3 CUDDEBACK porcupine 1
#> 2 I_00003a.JPG E3 CUDDEBACK kudu 1
#> 3 I__00013.JPG E3 CUDDEBACK springbok 1
#> metadata_Behaviour metadata_Sex n_images metadata_young_present
#> 1 <NA> <NA> 1 <NA>
#> 2 Moving Female 1 <NA>
#> 3 Moving <NA> 1 <NA>
#> metadata_Numberofindividuals
#> 1 NA
#> 2 NA
#> 3 NA
#> HierarchicalSubject
#> 1 Species, Species|porcupine, Number|1, Number
#> 2 Species|kudu, Behaviour, Sex|Female, Number|1, Behaviour|Moving, Species, Number, Sex
#> 3 Number, Behaviour|Moving, Species, Number|1, Species|springbok, Behaviour
NB: this file is the same as the digikam dataset included in the package. You can reproduce the following results by using:
data(digikam)
df <- digikam
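Before standardizing, it can be useful to verify that the export actually contains the Digikam columns this workflow relies on. A minimal base-R sketch (check_digikam_cols is a hypothetical helper, not a package function; the column names are taken from the preview above):

```r
# Hypothetical helper: fail early if expected Digikam columns are absent.
check_digikam_cols <- function(df) {
  needed <- c("Station", "Species", "DateTimeOriginal",
              "metadata_Species", "metadata_Number")
  missing <- setdiff(needed, names(df))
  if (length(missing) > 0) {
    stop("Missing columns: ", paste(missing, collapse = ", "))
  }
  invisible(TRUE)
}
```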
Standardize the file
Then, we standardize the file. The function standardize_snapshot_df standardizes a single dataframe.
This function has a number of options, but the only mandatory ones are:
- df: the dataframe to standardize
- standard_df: the reference dataframe telling the function how to rename the columns. Here, we use the built-in dataset standard.
std_df <- standardize_snapshot_df(df = df,
standard_df = standard,
locationID_digikam = "MOK",
logger = logger)
#> Initial file: 22 columns, 100 rows.
#> Standardizing columns
#> Match found in column names: renaming column metadata_Numberofindividuals into metadata_NumberOfIndividuals
#> Standardizing dates/times
#> Getting location code for Digikam data
#> Fill capture info
#> Cleaning location/camera, species and columns values
#> Final file: 27 columns, 100 rows. Here is a sneak peek:
#> locationID cameraID season roll eventID snapshotName eventDate eventTime
#> MOK MOK_A09 NA 1 MOK_A09#1#1 giraffe 2018-07-08 12:15:34
#> MOK MOK_A09 NA 1 MOK_A09#1#2 springbok 2018-08-26 10:45:55
#> MOK MOK_A09 NA 1 MOK_A09#1#3 unresolvable 2018-09-02 18:11:28
#> MOK MOK_B07 NA 1 MOK_B07#1#1 zebraburchells 2018-06-28 07:49:42
#> MOK MOK_B07 NA 1 MOK_B07#1#2 gemsbok 2018-08-19 09:11:55
Here, we also use two optional arguments:
- locationID_digikam: a location code, needed only if the data was processed with Digikam; in that case, the location (reserve) cannot be determined from the dataframe alone.
- logger: a logger created with create_logger. If you did not set up a logger, you can ignore this argument.
By default, the function displays the head of the first 8 columns of the file along with numerous messages.
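The eventID values in the preview (e.g. MOK_A09#1#1) follow a cameraID#roll#sequence pattern. A base-R sketch of how such an identifier can be composed (make_event_id is a hypothetical illustration, not a package function):

```r
# Hypothetical illustration of the cameraID#roll#sequence pattern
# seen in the eventID column of the standardized output.
make_event_id <- function(cameraID, roll, seq) {
  paste(cameraID, roll, seq, sep = "#")
}
make_event_id("MOK_A09", 1, 1)  # "MOK_A09#1#1"
```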
Write the file
The last step is to write the standardized file to a destination. For this, we use the function write_standardized_df.
This function has only two mandatory arguments:
- df: the file to write
- to: the folder in which the file should be written.
out_folder <- file.path(tempdir(), "data_out") # the folder in which to copy the file
out_folder
#> [1] "/tmp/Rtmp2TFAKG/data_out"
write_standardized_df(df = std_df,
to = out_folder,
logger = logger)
#> Creating folder /tmp/Rtmp2TFAKG/data_out
#> Writing file /tmp/Rtmp2TFAKG/data_out/MOK_SNA_R1.csv ---
Here, we also use the logger argument (as in the data standardization step).
The file is now written to the destination.
list.files(out_folder)
#> [1] "MOK_SNA_R1.csv"
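To double-check the result, the written file can be read back with base R (assuming a standard comma-separated CSV, which read.csv expects; adjust sep if your output uses another separator):

```r
# Read the standardized file back and inspect its shape.
std_check <- read.csv(file.path(out_folder, "MOK_SNA_R1.csv"))
names(std_check)
nrow(std_check)
```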
We can check that the log file was populated:
list.files(file.path(tempdir(), "log"))
#> [1] "logger.log"
readLines(logfile)
#> [1] "INFO [2024-04-13 18:02:56] Create logger /tmp/Rtmp2TFAKG/log/logger.log"
#> [2] "INFO [2024-04-13 18:02:56] Initial file: 22 columns, 100 rows."
#> [3] "INFO [2024-04-13 18:02:56] Standardizing columns"
#> [4] "INFO [2024-04-13 18:02:57] Match found in column names: renaming column metadata_Numberofindividuals into metadata_NumberOfIndividuals"
#> [5] "INFO [2024-04-13 18:02:57] Standardizing dates/times"
#> [6] "INFO [2024-04-13 18:02:57] Getting location code for Digikam data"
#> [7] "INFO [2024-04-13 18:02:57] Fill capture info"
#> [8] "INFO [2024-04-13 18:02:57] Cleaning location/camera, species and columns values"
#> [9] "INFO [2024-04-13 18:02:57] Final file: 27 columns, 100 rows. Here is a sneak peek:"
#> [10] "locationID\tcameraID\tseason\troll\teventID\tsnapshotName\teventDate\teventTime"
#> [11] "MOK\tMOK_A09\tNA\t1\tMOK_A09#1#1\tgiraffe\t2018-07-08\t12:15:34"
#> [12] "MOK\tMOK_A09\tNA\t1\tMOK_A09#1#2\tspringbok\t2018-08-26\t10:45:55"
#> [13] "MOK\tMOK_A09\tNA\t1\tMOK_A09#1#3\tunresolvable\t2018-09-02\t18:11:28"
#> [14] "MOK\tMOK_B07\tNA\t1\tMOK_B07#1#1\tzebraburchells\t2018-06-28\t07:49:42"
#> [15] "MOK\tMOK_B07\tNA\t1\tMOK_B07#1#2\tgemsbok\t2018-08-19\t09:11:55"
#> [16] ""
#> [17] "INFO [2024-04-13 18:02:57] Creating folder /tmp/Rtmp2TFAKG/data_out"
#> [18] "INFO [2024-04-13 18:02:57] Writing file /tmp/Rtmp2TFAKG/data_out/MOK_SNA_R1.csv ---"