Convert individual-level distance data to the transect-level format required by distsamp or gdistsamp

formatDistData(distData, distCol, transectNameCol, dist.breaks,
               occasionCol, effortMatrix)

Arguments

distData

data.frame in which each row is a detected individual. Must have at least 2 columns: one for distances and one for transect names.

distCol

character, name of the column in distData that contains the distances. The distances should be numeric.

transectNameCol

character, column name containing transect names. The transect column should be a factor.

dist.breaks

numeric vector of distance interval cutpoints. Length must equal J+1, where J is the number of distance intervals.

occasionCol

optional character. If transects were visited more than once, this can be used to format data for gdistsamp. It is the name of the column in distData that contains the occasion numbers. The occasion column should be a factor.

effortMatrix

optional M x T matrix of 1s and 0s. NAs are inserted where the matrix equals 0, indicating that a survey was not completed. If not supplied, a matrix of all 1s is created, since all surveys are assumed to have been completed.

Details

This function creates a site (M) by distance interval (J) response matrix from a data.frame containing the detection distances for each individual and the transect names. Alternatively, if each transect was surveyed T times, the resulting matrix is M x JT, which is the format required by gdistsamp; see unmarkedFrameGDS.

Value

An M x J or M x JT matrix containing the binned distance data. Transect names will become rownames and colnames will describe the distance intervals.

Note

It is important that the factor containing transect names includes levels for all the transects surveyed, not just those with >=1 detection. Likewise, if transects were visited more than once, the factor containing the occasion numbers should include levels for all occasions. See the example for how to add levels to a factor.

Examples

# Create a data.frame containing distances of animals detected
# along 4 transects.
dat <- data.frame(transect=gl(4,5, labels=letters[1:4]),
                  distance=rpois(20, 10))
dat
#>    transect distance
#> 1         a        5
#> 2         a        6
#> 3         a        9
#> 4         a       11
#> 5         a       13
#> 6         b        4
#> 7         b        7
#> 8         b        9
#> 9         b        8
#> 10        b       11
#> 11        c       16
#> 12        c        4
#> 13        c       11
#> 14        c        7
#> 15        c        9
#> 16        d       11
#> 17        d        7
#> 18        d        9
#> 19        d       11
#> 20        d       12

# Look at your transect names.
levels(dat$transect)
#> [1] "a" "b" "c" "d"

# Suppose that you also surveyed a transect named "e" where no animals were
# detected. You must add it to the levels of dat$transect
levels(dat$transect) <- c(levels(dat$transect), "e")
levels(dat$transect)
#> [1] "a" "b" "c" "d" "e"

# Distance cut points defining distance intervals
cp <- c(0, 8, 10, 12, 14, 18)

# Create formatted response matrix
yDat <- formatDistData(dat, "distance", "transect", cp)
yDat
#>   [0,8] (8,10] (10,12] (12,14] (14,18]
#> a     2      1       1       1       0
#> b     3      1       1       0       0
#> c     2      1       1       0       1
#> d     1      1       3       0       0
#> e     0      0       0       0       0
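
# For intuition, row "a" can be reproduced in base R from the distances
# printed above. This is only a sketch: it assumes intervals are closed on
# the right with the lowest break included, as the column labels suggest.
# The names d_a and row_a are hypothetical.
d_a <- c(5, 6, 9, 11, 13)
row_a <- table(cut(d_a, breaks=c(0, 8, 10, 12, 14, 18), include.lowest=TRUE))
row_a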

# Now you could merge yDat with transect-level covariates and
# then use unmarkedFrameDS to prepare data for distsamp


## Example for data from multiple occasions

dat2 <- data.frame(distance=1:100, site=gl(5, 20),
                   visit=factor(rep(1:4, each=5)))
cutpt <- seq(0, 100, by=25)
y2 <- formatDistData(dat2, "distance", "site", cutpt, "visit")
umf <- unmarkedFrameGDS(y=y2, numPrimary=4, survey="point",
                        dist.breaks=cutpt, unitsIn="m")
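
# If an occasion produced no detections at all, its level must still be
# present in the occasion factor (see Note). A base-R sketch with a
# hypothetical vector v in which occasion 3 is missing: re-specifying the
# levels keeps them in the order 1 to 4, whereas appending "3" to levels(v)
# would place it last.
v <- factor(c(1, 2, 2, 4))
levels(v)
v <- factor(v, levels=1:4)
table(v)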

## Example for data from multiple occasions with effortMatrix

dat3 <- data.frame(distance=1:100, site=gl(5, 20),
                   visit=factor(rep(1:4, each=5)))
cutpt <- seq(0, 100, by=25)

# 5 sites x 4 occasions; a 0 marks a survey that was not completed
effortMatrix <- matrix(rbinom(20, 1, 0.8), nrow=5, ncol=4)

y3 <- formatDistData(dat3, "distance", "site", cutpt, "visit", effortMatrix)
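
# A sketch of what the effort matrix implies for the M x JT layout, using
# hypothetical objects (y_demo, eff) rather than formatDistData itself. It
# assumes the J columns for occasion 1 come first, then those for occasion 2,
# and so on.
nSites <- 5; nInt <- 4; nOcc <- 4
y_demo <- matrix(1L, nrow=nSites, ncol=nInt * nOcc)
eff <- matrix(1, nrow=nSites, ncol=nOcc)
eff[2, 3] <- 0                       # site 2 was not surveyed on occasion 3
# Repeat each occasion's effort across its J interval columns, then blank out
y_demo[eff[, rep(seq_len(nOcc), each=nInt)] == 0] <- NA
which(is.na(y_demo[2, ]))            # columns 9 to 12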