File locking

File locking is a mechanism that restricts access to a computer file by allowing only one user or process access at any specific time. Systems implement locking to prevent the classic interceding update scenario, which is a typical example of race condition, by enforcing the serialization of update processes to any given file. The following example illustrates the interceding update problem:

Process A reads a customer record from a file containing account information, including the customer's account balance and phone number.
Process B now reads the same record from the same file so it has its own copy.
Process A changes the account balance in its copy of the customer record and writes the record back to the file.
Process B, which still has the original stale value for the account balance in its copy of the customer record, updates the account balance and writes the customer record back to the file.
Process B has now written its stale account-balance value to the file, causing the changes made by process A to be lost.

Most operating systems support the concept of record locking, which means that individual records within any given file may be locked, thereby increasing the number of concurrent update processes. Database maintenance uses file locking, whereby it can serialize access to the entire physical file underlying a database. Although this does prevent any other process from accessing the file, it can be more efficient than individually locking a large number of regions in the file by removing the overhead of acquiring and releasing each lock.

Poor use of file locks, like any computer lock, can result in poor performance or in deadlocks. File locking may also refer to additional security applied by a computer user either by using Windows security, NTFS permissions or by installing a third party file locking software.

In mainframes

IBM pioneered file locking in 1963 for use in mainframe computers using OS/360, where it was termed "exclusive control".

In Microsoft Windows

Microsoft Windows uses three distinct mechanisms to manage access to shared files:

using share-access controls that allow applications to specify whole-file access-sharing for read, write, or delete
using byte-range locks to arbitrate read and write access to regions within a single file
by Windows file systems disallowing executing files from being opened for write or delete access

Windows inherits the semantics of share-access controls from the MS-DOS system, where sharing was introduced in MS–DOS 3.3. Thus, an application must explicitly allow sharing; otherwise an application has exclusive read, write, and delete access to the file (other types of access, such as those to retrieve the attributes of a file are allowed.)

For a file with shared access, applications may then use byte-range locking to control access to specific regions of the file. Such byte-range locks specify a region of the file (offset and length) and the type of lock (shared or exclusive). Note that the region of the file being locked is not required to have data within the file, and applications sometimes exploit this ability to implement their functionality.

For applications that use the file read/write APIs in Windows, byte-range locks are enforced (also referred to as mandatory locks) by the file systems that execute within Windows. For applications that use the file mapping APIs in Windows, byte-range locks are not enforced (also referred to as advisory locks.) Byte-range locking may also have other side-effects on the Windows system. For example, the Windows file-sharing mechanism will typically disable client side caching of a file for all clients when byte-range locks are used on any client to control file access. The client will observe slower access because read and write operations must be sent to the server where the file is stored.

Improper error-handling in an application program can lead to a scenario where a file is locked (either using "share" access or with byte-range file locking) and cannot be accessed by other applications. If so, the user may be able to restore file access by manually terminating the malfunctioning program. This is typically done through the Task Manager utility.

The sharing mode parameter in the CreateFile function used to open files determines file-sharing. Files can be opened to allow sharing the file for read, write, or delete access. Subsequent attempts to open the file must be compatible with all previously granted sharing-access to the file. When the file is closed, sharing-access restrictions are adjusted to remove the restrictions imposed by that specific file open.

Byte-range locking type is determined by the dwFlags parameter in the LockFileEx function used to lock a region of a file. The Windows API function LockFile can also be used and acquires an exclusive lock on the region of the file.

Any file that is executing on the computer system as a program (e.g., an EXE, COM, DLL, CPL or other binary program file format) is normally prevented by the file system from being opened for write or delete access, reporting a sharing violation, despite the fact that the program is not opened by any application. However, some access is still allowed. For example, a running application file can be renamed or copied (read) even when executing.

Files are accessed by applications in Windows by using file handles. These file handles can be explored with the Process Explorer utility. This utility can also be used to force-close handles without needing to terminate the application holding them.

Microsoft Windows XP and Server 2003 editions have introduced volume snapshot (VSS) capability to NTFS, allowing open files to be accessed by backup software despite any exclusive locks. However, unless software is rewritten to specifically support this feature, the snapshot will be crash consistent only, while properly supported applications can assist the operating system in creating "transactionally consistent" snapshots. Other commercial software for accessing locked files under Windows include File Access Manager and Open File Manager. These work by installing their own drivers to access the files in kernel mode.

In Unix-like systems

Unix-like operating systems (including Linux and Apple's OS X) do not normally automatically lock open files or running programs. Several kinds of file-locking mechanisms are available in different flavors of Unix, and many operating systems support more than one kind for compatibility. The two most common mechanisms are fcntl and flock(2). A third such mechanism is lockf(3), which may be separate or may be implemented using either of the first two primitives. Although some types of locks can be configured to be mandatory, file locks under Unix are by default advisory. This means that cooperating processes may use locks to coordinate access to a file among themselves, but uncooperative processes are also free to ignore locks and access the file in any way they choose. In other words, file locks lock out other file lockers only, not I/O.

Two kinds of locks are offered: shared locks and exclusive locks. In the case of fcntl, different kinds of locks may be applied to different sections (byte ranges) of a file, or else to the whole file. Shared locks can be acquired by an unlimited number of processes at the same time, but an exclusive lock can only be acquired by one process, and cannot coexist with a shared lock. To acquire a shared lock, a process must wait until no processes hold any exclusive locks. To acquire an exclusive lock, a process must wait until no processes hold either kind of lock. Unlike locks created by fcntl, those created by flock are preserved across forks, making them useful in forking servers. It is therefore possible for more than one process to hold an exclusive lock on the same file, provided these processes share a filial relationship and the exclusive lock was initially created in a single process before being duplicated across a fork.

Shared locks are sometimes called "read locks" and exclusive locks are sometimes called "write locks". However, because locks on Unix are advisory, this isn't enforced. Thus it is possible for a database to have a concept of "shared writes" vs. "exclusive writes"; for example, changing a field in place may be permitted under shared access, whereas garbage-collecting and rewriting the database may require exclusive access.

File locks apply to the actual file, rather than the file name. This is important since Unix allows multiple names to refer to the same file. Together with non-mandatory locking, this leads to great flexibility in accessing files from multiple processes. On the other hand, the cooperative locking approach can lead to problems when a process writes to a file without obeying file locks set by other processes.

For this reason, some Unix-like operating systems also offer limited support for mandatory locking.^[1] On such systems, a file whose setgid bit is on but whose group execution bit is off when that file is opened will be subject to automatic mandatory locking if the underlying filesystem supports it. However, non-local NFS partitions tend to disregard this bit.^[2] This is an exception to the earlier statement that Unix does not automatically lock open files. This strategy first originated in System V, and can be seen today in the Solaris, HP-UX, and Linux operating systems. It is not part of POSIX, however, and BSD-derived operating systems like FreeBSD, OpenBSD, NetBSD, and Apple's Mac OS X do not support it.^[3] Linux also supports mandatory locking through the special "mount(8) -o mand" parameter for filesystem mounting, but this is rarely used.

Problems

Both flock and fcntl have quirks that occasionally puzzle programmers more familiar with other operating systems.

More than one process can hold an exclusive flock on a given file if the exclusive lock was duplicated across a later fork. This simplifies coding for network servers and helps prevent race conditions, but can be confusing to the unaware.

Mandatory locks have no effect on the unlink system call. Consequently, certain programs may, effectively, circumvent mandatory locking. The authors of Advanced Programming in the UNIX Environment (Second Edition) observed that the ed editor did that indeed (page 456).

Whether and how flock locks work on network filesystems, such as NFS, is implementation dependent. On BSD systems, flock calls on a file descriptor open to a file on an NFS-mounted partition are successful no-ops. On Linux prior to 2.6.12, flock calls on NFS files would act only locally. Kernel 2.6.12 and above implement flock calls on NFS files using POSIX byte-range locks. These locks will be visible to other NFS clients that implement fcntl-style POSIX locks, but invisible to those that do not.^[4]

Lock upgrades and downgrades release the old lock before applying the new lock. If an application downgrades an exclusive lock to a shared lock while another application is blocked waiting for an exclusive lock, the latter application may get the exclusive lock and lock the first application out. This means that lock downgrades can block, which may be counterintuitive.

All fcntl locks associated with a file for a given process are removed when any file descriptor for that file is closed by that process, even if a lock was never requested for that file descriptor. Also, fcntl locks are not inherited by a child process. The fcntl close semantics are particularly troublesome for applications that call subroutine libraries that may access files. Neither of these "bugs" occurs using real flock-style locks.

Preservation of the lock status on open file descriptors passed to another process using a Unix domain socket is implementation dependent.

AFS and buffered I/O Problems

The Andrew File System (AFS) presents an interesting case where file locking fails. If an AFS file is accessible from several different machines simultaneously, then a lock obtained on one machine isn't known to another machine. Therefore, two (or more) users on different machines can lock the same file for exclusive use, and each believes their read/write operations are being done by only their machine, when in fact, both can be writing to the same section of the same file. Therefore, "flock" or "fcntl" fails to actually lock exclusively across machines. The locks are successful on any single machine, therefore the only practical way to deal with simultaneous exclusive write access is to have all users login to the same machine. This can be controlled by requiring access programs or scripts that check that the user is on a specific host.

Another source of lock failure occurs when buffered I/O has buffers assigned in the user's local workspace, rather than in an operating system buffer pool. "fread" and "fwrite" are commonly used to do buffered I/O, and once a section of a file is read, another attempt to read that same section will, most likely, obtain the data from the local buffer. The problem is another user attached to the same file has their own local buffers, and the same thing is happening for them. "fwrite" of data obtained from the buffer by "fread" will NOT be obtaining the data from the file itself, and some other user could have changed it. Both could use "flock" for exclusive access, which prevents simultaneous writes, but since the reads are reading from the buffer and not the file itself, any data changed by user #1 can be lost by user #2 (overwritten). The best solution to this problem is to use unbuffered I/O ("read" and "write") with "flock", which also means using "lseek" instead of "fseek" and "ftell". Of course, you'll have to make adjustments for function parameters and results returned. Generally speaking, buffered I/O is unsafe with AFS when used with shared files.

In Amiga OS

In Amiga OS, a lock on a file (or directory) can be acquired using the Lock function (in the dos.library). A lock can be shared (other processes can read the file/directory, but can't modify or delete it), or exclusive so that only the process which successfully acquires the lock can access or modify the object. The lock is on the whole object and not part of it. The lock must be released with the UnLock function: unlike in Unix, the operating system does not implicitly unlock the object when the process terminates.

Lock files

Shell scripts and other programs often use a strategy similar to the use of file locking: creation of lock files, which are files whose contents are irrelevant (although often one will find the process identifier of the holder of the lock in the file) and whose sole purpose is to signal by their presence that some resource is locked. A lock file is often the best approach if the resource to be controlled is not a regular file at all, so using methods for locking files does not apply. For example, a lock file might govern access to a set of related resources, such as several different files, directories, a group of disk partitions, or selected access to higher level protocols like servers or database connections.

When using lock files, care must be taken to ensure that operations are atomic. To obtain a lock, the process must verify that the lock file does not exist and then create it, whilst preventing another process from creating it in the meantime. Various methods to do this include:

Using the lockfile command (a conditional semaphore-file creator distributed in the procmail package).
System calls that create a file, but fail if the file already exists. (System calls are available from languages such as C or C++, and shell scripts can make use of noclobber)
Using the mkdir command and checking the exit code for failure^[5]

Certain Mozilla products (such as Firefox, Thunderbird, Sunbird) use this type of file resource lock mechanism (using a temporary file named "parent.lock".)

Unlocker software

An unlocker is a utility used to determine what process is locking a file, and displays a list of processes as well as choices on what to do with the process (kill task, unlock, etc.) along with a list of file options such as delete or rename. On some Unix-like systems, utilities such as fstat and lockf can be used to inspect the state of file locks by process, by filename, or both.

On Windows systems, if a file is locked, it's possible to schedule its moving or deletion to be performed on the next reboot. This approach is typically used by installers to replace locked system files.

Version control systems

In version control systems file locking is used to prevent two users changing the same file version in parallel and then when saving, the second user to overwrite what first user changed. This is implemented by marking locked files as read-only in the file system. A user wanting to change the file performs an unlock (also called checkout) operation, and until a check-in (store) operation is done, or the lock is reverted, nobody else is allowed to unlock the file.

References

↑ "Mandatory File Locking For The Linux Operating System". Retrieved 2011-10-08.
↑ "Use Setuid, Setgid, and Sticky Bits with Server for NFS". Retrieved 2011-10-08.
↑ John Viega; Matt Messier (2003). "2.8 Locking Files". Secure Programming Cookbook for C and C++ (1st ed.). Sabastopol, CA: O'Reilly Media. p. 792. ISBN 978-0-596-00394-4. Support for mandatory locks varies greatly from one Unix variant to another. Both Linux and Solaris support mandatory locks, but Darwin, FreeBSD, NetBSD, and OpenBSD do not, even though they export the interface used by Linux and Solaris to support them. On such systems, this interface creates advisory locks. Support for mandatory locking does not extend to NFS.
↑ "Linux NFS FAQ: D. Commonly occurring error messages".
↑ "Lock your script (against parallel run)".

External links

Everything you never wanted to know about file locking, a review of the Unix file locking options and their problems (dated December 13, 2010)
File-private POSIX locks, a LWN.net article on file locks supported on Linux that behave differently from POSIX locks regarding inheritance and behavior on close

This article is issued from Wikipedia - version of the Thursday, January 14, 2016. The text is available under the Creative Commons Attribution/Share Alike but additional terms may apply for the media files.