seqid was created primarily to make it possible to compute lagged values, differences and growth rates on irregularly spaced time series and panels (#26). flag, fdiff and fgrowth do not natively support such data: rather than pre-computing an ordering of the data, they derive the ordering directly from the supplied id and time variables, and they raise an error on gaps or repeated time values. See flag for computational details.
Fortunately, any irregular time series or panel can be expressed as a regular panel with a group-id constructed so that the time periods within each group are consecutive.
A simple way to apply existing functionality (flag, fdiff and fgrowth) to irregular time series and panels is thus to create a group-id that, together with the time variable, fully identifies the data. seqid makes this very easy: for an irregular panel with arbitrary gaps or repeated values in the time variable, an appropriate id variable can be generated using settransform(data, newid = seqid(time, radixorder(id, time))). Lags can then be computed using L(data, 1, ~newid, ~time), etc. In this way collapse balances very fast computation on the vast majority of time series and panels (which may be unbalanced, but where observations for each entity are consecutive in time) against flexibility of application.
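The workflow described above can be sketched on a small made-up panel. The data frame and its column names (data, id, time, x) are hypothetical, and cols = "x" is added only to restrict the lag to one column; this is an illustration under those assumptions, not a reference implementation.

```r
library(collapse)

# Hypothetical irregular panel: entity 1 has a gap (no observation at time 3).
data <- data.frame(
  id   = c(1L, 1L, 1L, 1L, 2L, 2L, 2L),
  time = c(1L, 2L, 4L, 5L, 1L, 2L, 3L),
  x    = c(10, 11, 13, 14, 20, 21, 22)
)

# In (id, time) order, a new sequence starts whenever time does not
# advance by exactly 1, so the gap splits entity 1 into two groups.
settransform(data, newid = seqid(time, radixorder(id, time)))

# Lag x within the new groups. The first observation after the gap has
# no predecessor in its group, so its lag is NA.
lagged <- L(data, 1, ~newid, ~time, cols = "x")
```

Here newid takes three distinct values: two for entity 1 (before and after the gap) and one for entity 2.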
In general, for any regularly spaced panel, the identity identical(groupid(id, order(id, time)), seqid(time, order(id, time))) should hold.
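As a rough check of this identity, the following sketch builds a small balanced panel (hypothetical data, stored in shuffled row order) and computes both group-ids. The object names panel, by_id, by_time and same are made up for illustration.

```r
library(collapse)

# Hypothetical balanced panel: 3 entities observed in periods 1 to 4,
# deliberately stored in shuffled row order.
panel <- data.frame(id = rep(1:3, each = 4), time = rep(1:4, times = 3))
panel <- panel[c(5, 1, 9, 2, 6, 10, 3, 7, 11, 4, 8, 12), ]

o <- with(panel, order(id, time))
by_id   <- with(panel, groupid(id, o))  # new group whenever id changes
by_time <- with(panel, seqid(time, o))  # new group whenever time stops advancing by 1

# Compare the two group-ids, as in the identity stated above.
same <- identical(by_id, by_time)
```

Both vectors are returned in the original row order, so they can be compared element by element.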
Note that regularly spaced panels with gaps in time (such as a panel survey) can be handled either by seqid(..., del = gap) or, in most cases, simply by converting the time variable to a factor using qF, which makes observations consecutive.
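Both options can be illustrated on a made-up biennial survey. The data frame survey and the names seq_gap and lag_x are hypothetical; the sketch assumes a constant spacing of 2 time units between waves.

```r
library(collapse)

# Hypothetical biennial panel survey: waves every 2 years for 2 entities.
survey <- data.frame(
  id   = rep(1:2, each = 3),
  time = rep(c(2000L, 2002L, 2004L), times = 2),
  x    = c(1, 2, 3, 4, 5, 6)
)
o <- with(survey, radixorder(id, time))

# Option 1: tell seqid that consecutive observations are 2 time units apart.
seq_gap <- with(survey, seqid(time, o, del = 2L))

# Option 2: qF() maps the waves to factor levels whose integer codes
# (1, 2, 3) are consecutive, so the original id can be used directly.
lag_x <- with(survey, flag(x, 1, g = id, t = qF(time)))
```

With del = 2L each entity forms a single sequence, whereas the default del = 1L would start a new sequence at every observation.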
There are potentially other, more analytical applications for seqid...
For the opposite operation of creating a new time variable that is consecutive within each group, see data.table::rowid.
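A minimal sketch of that opposite operation, assuming the data.table package is installed; the vectors id and newtime are hypothetical.

```r
# Hypothetical grouped data with no usable time variable.
id <- c("a", "a", "b", "a", "b")

# rowid() numbers observations 1, 2, ... within each group, in row order.
newtime <- data.table::rowid(id)
```

Because the numbering follows row order, the data would need to be sorted as intended before calling rowid.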