filelock
Portable File Locking
Place an exclusive or shared lock on a file. It uses LockFile
on
Windows and fcntl
locks on Unix-like systems.
Installation
Install the package from CRAN as usual:
install.packages("filelock")
Install the development version from GitHub:
pak::pak("r-lib/filelock")
Usage
library(filelock)
This is R process 1, it gets an exclusive lock. If you want to lock file
myfile
, always create a separate lock file instead of placing the lock
on this file directly!
R1> lck <- lock("/tmp/myfile.lck")
This is R process 2, it fails to acquire a lock.
R2> lock("/tmp/myfile.lck", timeout = 0)
Specifying a timeout interval, before giving up:
R2> lock("/tmp/myfile.lck", timeout = 5000)
Wait indefinetely:
R2> lock("/tmp/myfile.lck", timeout = Inf)
Once R process 1 released the lock (or terminated), R process 2 can acquire the lock:
R1> unlock(lck)
R2> lock("/tmp/myfile.lck")
#> Lock on ‘/tmp/myfile.lck’
Documentation
Warning
Always use special files for locking. I.e. if you want to restict access
to a certain file, do not place the lock on this file. Create a
special file, e.g. by appending .lock
to the original file name and
place the lock on that. (The lock()
function creates the file for you,
actually, if it does not exist.) Reading from or writing to a locked
file has undefined behavior! (See more about this below at the Internals
Section.)
It is hard to determine whether and when it is safe to remove these special files, so our current recommendation is just to leave them around.
It is best to leave the special lock file empty, simply because on some OSes you cannot write to it (or read from it), once the lock is in place.
Advisory Locks:
All locks set by this package might be advisory. A process that does not respect this locking machanism may be able to read and write the locked file, or even remove it (assuming it has capabilities to do so).
Unlock on Termination:
If a process terminates (with a normal exit, a crash or on a signal), the lock(s) it is holding are automatically released.
If the R object that represents the lock (the return value of lock
)
goes out of scope, then the lock will be released automatically as soon
as the object is garbage collected. This is more of a safety mechanism,
and the user should still unlock()
locks manually, maybe using
base::on.exit()
, so that the lock is released in case of errors as
well, as soon as possible.
Special File Systems:
File locking needs support from the file system, and some non-standard
file systems do not support it. For example on network file systems like
NFS or CIFS, user mode file systems like sshfs
or ftpfs
, etc.,
support might vary. Recent Linux versions and recent NFS versions (from
version 3) do support file locking, if enabled.
In theory it is possible to simply test for lock support, using two
child processes and a timeout, but filelock
does not do this
currently.
Locking Part of a File:
While this is possible in general, filelock
does not suport it
currently. The main purpose of filelock
is to lock using special lock
files, and locking part of these is not really useful.
Internals on Unix:
On Unix (i.e. Linux, macOS, etc.), we use fcntl
to acquire and release
the locks. You can read more about it here:
https://www.gnu.org/software/libc/manual/html_node/File-Locks.html
Some important points:
- The lock is put on a file descriptor, which is kept open, until the lock is released.
- A process can only have one kind of lock set for a given file.
- When any file descriptor for that file is closed by the process, all of the locks that process holds on that file are released, even if the locks were made using other descriptors that remain open. Note that in R, using a one-shot function call to modify the file opens and closes a file descriptor to it, so the lock will be released. (This is one of the main reasons for using special lock files, instead of putting the lock on the actual file.)
- Locks are not inherited by child processes created using fork.
- For lock requests with finite timeout intervals, we set an alarm, and
temporarily install a signal handler for it. R is single threaded, so
no other code can be running, while the process is waiting to acquire
the lock. The signal handler is restored to its original value
immediately after the lock is acquired or the timeout expires. (It is
actually restored from the signal handler, so there should be no race
conditions here. However, if multiple
SIGALRM
signals are delivered via a single call to the signal handler, then alarms might get lost. Currently base R does not use theSIGALRM
signal for anything, but other packages might.)
Internals on Windows:
On Windows, LockFileEx
is used to create the lock on the file. If a
finite timeout is specified for the lock request, asynchronous
(overlapped) I/O is used to wait for the locking event with a timeout.
See more about LockFileEx
on the first hit here:
https://msdn.microsoft.com/en-us/library/aa365203.aspx
Some important points:
LockFileEx
locks are mandatory (as opposed to advisory), so indeed no other processes have access to the locked file. Actually, even the locking process has no access to it through a different file handle, than the one used for locking. In general, R cannot read from the locked file, and cannot write to it. (Although, the current R version does not fail, it just does nothing, which is quite puzzling.) Remember, always use a special lock file, instead of putting the lock on the main file, so that you are not affected by these problems.- Inherited handles do not provide access to the child process.
Code of Conduct
Please note that the fs project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.
License
MIT © RStudio