Individual tree segmentation with several possible algorithms. The returned point cloud has a new
extra byte attribute named after the parameter attribute
independently of the algorithm used.
segment_trees(las, algorithm, attribute = "treeID", uniqueness = "incremental")
An object of class LAS or LAScatalog.
function. An algorithm of individual tree segmentation. lidR
has:
dalponte2016, watershed, li2012 and silva2016.
More experimental algorithms may be found in the package lidRplugins.
character. The returned LAS object as a new extra byte attribute (in a new column).
This parameter controls the name of the new attribute. Default is "treeID"
.
character. A method to compute a unique ID. Can be 'incremental', 'gpstime' or 'bitmerge'. See section 'Uniqueness'. This feature must be considered as 'experimental'.
If the input is a LAS
object, return a LAS
object. If the input is a
LAScatalog
, returns a LAScatalog
.
By default the tree IDs are numbered from 1 to n, n being the number of trees found. The problem
with such incremental numbering is that, while it ensures a unique ID is assigned for each tree in
a given point-cloud, it also guarantees duplication of tree IDs in different tiles or chunks when
processing a LAScatalog
. This is because each chunk/file is processed independently of the others
and potentially in parallel on different computers. Thus, the index always restarts at 1 on each
chunk/file or chunk. Worse, in a tree segmentation process, a tree that is located exactly between
2 chunks/files will have two different IDs for its two halves.
This is why we introduced some uniqueness strategies that are all imperfect and that should be seen as experimental. Please report any troubleshooting. Using a uniqueness-safe strategy ensures that trees from different files will not share the same IDs. Moreover, it also means that two halves of a tree on the edge of a processing chunk will be assigned the same ID.
Number from 0 to n. This method does not ensure uniqueness of the IDs. This is the legacy method.
This method uses the gpstime of the highest point of a tree (apex) to create a unique ID. This ID is not an integer but a 64-bit decimal number, which is suboptimal but at least it is expected to be unique if the gpstime attribute is consistent across files. If inconsistencies with gpstime are reported (for example gpstime records the week time and was reset to 0 in a coverage that takes more than a week to complete), there is a (low) probability of getting ID attribution errors.
This method uses the XY coordinates of the highest point (apex) of a tree to create a single 64-bit number with a bitwise operation. First, XY coordinates are converted to 32-bit integers using the scales and offsets of the point cloud. For example, if the apex is at (10.32, 25.64) with a scale factor of 0.01 and an offset of 0, the 32-bit integer coordinates are X = 1032 and Y = 2564. Their binary representations are, respectively, (here displayed as 16 bits) 0000010000001000 and 0000101000000100. X is shifted by 32 bits and becomes a 64-bit integer. Y is kept as-is and the binary representations are unionized into a 64-bit integer like (here displayed as 32 bit) 00000100000010000000101000000100 that is guaranteed to be unique. However R does not support 64-bit integers. The previous steps are done at C++ level and the 64-bit binary representation is reinterpreted into a 64-bit decimal number to be returned in R. The IDs thus generated are somewhat weird. For example, the tree ID 00000100000010000000101000000100 which is 67635716 if interpreted as an integer becomes 3.34164837074751323479078607289E-316 if interpreted as a decimal number. This is far from optimal but at least it is guaranteed to be unique if all files have the same offsets and scale factors.
All the proposed options are suboptimal because they either do not guarantee uniqueness in all cases (inconsistencies in the collection of files), or they imply that IDs are based on non-integers or meaningless numbers. But at least it works and deals with some of the limitations of R.
This section appears in each function that supports a LAScatalog as input.
In lidR
when the input of a function is a LAScatalog the
function uses the LAScatalog processing engine. The user can modify the engine options using
the available options. A careful reading of the
engine documentation is recommended before processing LAScatalogs
. Each
lidR
function should come with a section that documents the supported engine options.
The LAScatalog
engine supports .lax
files that significantly improve the computation
speed of spatial queries using a spatial index. Users should really take advantage a .lax
files,
but this is not mandatory.
Supported processing options for a LAScatalog
(in bold). For more details see the
LAScatalog engine documentation:
chunk size: How much data is loaded at once.
chunk buffer*: Mandatory to get a continuous output without edge effects. The buffer is always removed once processed and will never be returned either in R or in files.
chunk alignment: Align the processed chunks.
progress: Displays a progression estimation.
output files*: Mandatory because the output is likely to be too big to be returned
in R and needs to be written in las/laz files. Supported templates are {XLEFT}
, {XRIGHT}
,
{YBOTTOM}
, {YTOP}
, {XCENTER}
, {YCENTER}
{ID}
and, if
chunk size is equal to 0 (processing by file), {ORIGINALFILENAME}
.
select: The function will write files equivalent to the original ones. Thus select = "*"
and cannot be changed.
filter: Read only points of interest.
# NOT RUN {
LASfile <- system.file("extdata", "MixedConifer.laz", package="lidR")
las <- readLAS(LASfile, select = "xyz", filter = "-drop_z_below 0")
# Using Li et al. (2012)
las <- segment_trees(las, li2012(R = 3, speed_up = 5))
#plot(las, color = "treeID")
# }
Run the code above in your browser using DataLab