Multi-threaded and file access - Smalltalk




  1. Multi-threaded and file access

    Developing a multi-threaded application which accesses files.
    Individual threads do file IO - reading and writing. Originally, I
    designed this with mutexes so files would not become mangled.
    However, in retrospect I believe the underlying OS (Windows, Linux)
    should handle file-locking issues.

    Can anyone shed some light on this for me?

    Thanks.

  2. Re: Multi-threaded and file access

    pineapple.link@yahoo.com writes:

    > Developing a multi-threaded application which accesses files.
    > Individual threads do file IO - reading and writing. Originally, I
    > designed this with mutexes so files would not become mangled.
    > However, in retrospect I believe the underlying OS (Windows, Linux)
    > should handle file-locking issues.
    >
    > Can anyone shed some light on this for me?


    It depends on the granularity and type of file accesses.

    The OS will serialize accesses to the file, block by block.

    But because of buffering in user space, one I/O function call may
    involve several blocks, and if two threads (or two processes) happen
    to write overlapping blocks, the outcome is unpredictable.

    Indeed, it's not enough to synchronize your threads; you also need to
    synchronize with other processes. For the latter, you can use the file
    locking primitives provided by the system. (Have a look at flock(2)
    and lockf(3)).



    If you have a file that you update in whole, then using a mutex to
    exclude other threads and flocking it to exclude other processes will
    be enough to guarantee an "atomic" update of the file by a single
    thread.
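
    A minimal sketch of the whole-file case described above, assuming a
    POSIX system and Python's fcntl module (not available on Windows). The
    names update_whole_file and _file_mutex are hypothetical, and flock
    locks are advisory: they only exclude other processes that also call
    flock on the same file.

```python
import fcntl
import os
import tempfile
import threading

# Hypothetical process-wide mutex: serializes the threads of this process.
_file_mutex = threading.Lock()


def update_whole_file(path, new_contents):
    """Replace the contents of `path` as one exclusive update."""
    with _file_mutex:  # excludes other threads in this process
        fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
        with os.fdopen(fd, "r+b") as f:
            # Advisory lock: blocks until no cooperating process holds it.
            fcntl.flock(f, fcntl.LOCK_EX)
            try:
                f.seek(0)
                f.write(new_contents)
                f.truncate()          # drop any leftover tail of the old data
                f.flush()
                os.fsync(f.fileno())  # push the update to the device
            finally:
                fcntl.flock(f, fcntl.LOCK_UN)


if __name__ == "__main__":
    # Illustrative usage with a throwaway file.
    demo_path = os.path.join(tempfile.mkdtemp(), "demo.bin")
    update_whole_file(demo_path, b"first version\n")
    update_whole_file(demo_path, b"second\n")
    with open(demo_path, "rb") as demo:
        print(demo.read())
```

    The mutex is taken before the flock so that only one thread per
    process ever waits on the OS-level lock.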

    If your file has some structure and you want to allow update of
    individual elements in the file, then along with the mutex for
    threads, you will need to lockf the range to be updated. You may have
    one mutex per range to allow updating different elements in parallel
    by several threads.
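
    The per-range variant can be sketched along the same lines, again
    assuming POSIX and Python's fcntl module. The RecordFile class and the
    fixed RECORD_SIZE are hypothetical; a real file layout will differ.
    POSIX record locks belong to the process, not the thread, which is why
    each range still needs its own mutex.

```python
import fcntl
import os
import tempfile
import threading

RECORD_SIZE = 16  # hypothetical fixed record size, in bytes


class RecordFile:
    """File of fixed-size records with one mutex and one byte range per record."""

    def __init__(self, path, record_count):
        self._fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
        wanted = RECORD_SIZE * record_count
        if os.fstat(self._fd).st_size < wanted:
            os.ftruncate(self._fd, wanted)  # extend with zero bytes
        self._mutexes = [threading.Lock() for _ in range(record_count)]

    def write_record(self, index, payload):
        if len(payload) > RECORD_SIZE:
            raise ValueError("payload larger than a record")
        data = payload.ljust(RECORD_SIZE, b"\0")
        offset = index * RECORD_SIZE
        with self._mutexes[index]:  # excludes threads touching this record
            # Lock only this record's byte range against other processes.
            fcntl.lockf(self._fd, fcntl.LOCK_EX, RECORD_SIZE, offset, os.SEEK_SET)
            try:
                os.pwrite(self._fd, data, offset)
            finally:
                fcntl.lockf(self._fd, fcntl.LOCK_UN, RECORD_SIZE, offset, os.SEEK_SET)

    def read_record(self, index):
        offset = index * RECORD_SIZE
        with self._mutexes[index]:
            fcntl.lockf(self._fd, fcntl.LOCK_SH, RECORD_SIZE, offset, os.SEEK_SET)
            try:
                return os.pread(self._fd, RECORD_SIZE, offset).rstrip(b"\0")
            finally:
                fcntl.lockf(self._fd, fcntl.LOCK_UN, RECORD_SIZE, offset, os.SEEK_SET)

    def close(self):
        # Closing the descriptor also releases this process's locks on the file.
        os.close(self._fd)


if __name__ == "__main__":
    # Illustrative usage with a throwaway file.
    records = RecordFile(os.path.join(tempfile.mkdtemp(), "records.bin"), 4)
    records.write_record(2, b"hello")
    print(records.read_record(2))
    records.close()
```

    Threads writing different records proceed in parallel; threads writing
    the same record queue up on that record's mutex.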

    Of course, you will have to be careful if you need to update several
    elements in the file at once, e.g. if you need to update the header
    along with the element.


    --
    __Pascal Bourguignon__ http://www.informatimago.com/

    "This statement is false." In Lisp: (defun Q () (eq nil (Q)))

  3. Re: Multi-threaded and file access

    On Sep 27, 1:50 pm, p...@informatimago.com (Pascal J. Bourguignon)
    wrote:
    > Indeed, it's not enough to synchronize your threads; you also need to
    > synchronize with other processes. For the latter, you can use the file
    > locking primitives provided by the system. (Have a look at flock(2)
    > and lockf(3)).


    I am using VisualWorks. I was unable to locate any methods in the
    system called "flock" or "lockf." I thus presume you are referring to
    some sort of OS low-level "C call" one must make to lock these files?
    I find it hard to believe a Smalltalk system like VisualWorks would
    demand that the user perform some low-level C call to lock down a
    file, since this should not be a rare or exotic operation in a multi-
    threaded environment.

    Acknowledge?

  4. Re: Multi-threaded and file access

    On Sep 26, 7:27 pm, pineapple.l...@yahoo.com wrote:
    > Developing a multi-threaded application which accesses files.
    > Individual threads do file IO - reading and writing.  Originally, I
    > designed this with mutexes so files would not become mangled.
    > However, in retrospect I believe the underlying OS (Windows, Linux)
    > should handle file-locking issues.
    >
    > Can anyone shed some light on this for me?



    Just some advice: this is generally a bad idea. ;-)

    Even if you synchronized everything wonderfully and you got the OS to
    work with you (which shouldn't be very difficult since most modern
    operating systems have some form of thread-safe file access), you're
    going to be thrashing your disk. Two threads accessing two different
    files on opposite sides of the disk at the same time will not only be
    extremely inefficient, but may also reduce the lifetime of the
    hardware. And if you are planning on ever reading off something other
    than a hard drive (e.g. an optical drive), the efficiency loss will be
    unbearable.

    It would be better to create a single thread that handles all the disk
    access for you, and let all the other threads make requests through
    it. This FileAccessThread can then optimize reads, writes, and even
    access ordering (i.e., file B physically follows file A on disk, so
    read it next instead of C, etc.).

    This will also be much easier to maintain, debug, and extend.
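
    A rough sketch of such a FileAccessThread, assuming Python's standard
    queue and threading modules. The method names and the helper functions
    are hypothetical, and the requests are simply served in arrival order;
    any reordering by physical location would have to be added on top.

```python
import os
import queue
import tempfile
import threading


def _read_file(path):
    with open(path, "rb") as f:
        return f.read()


def _append_file(path, data):
    with open(path, "ab") as f:
        return f.write(data)


class FileAccessThread:
    """Single worker that performs all file I/O on behalf of other threads."""

    def __init__(self):
        self._requests = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def _run(self):
        while True:
            item = self._requests.get()
            if item is None:  # sentinel queued by stop()
                break
            func, args, reply = item
            try:
                reply.put((True, func(*args)))
            except Exception as exc:  # hand the error back to the caller
                reply.put((False, exc))

    def _submit(self, func, *args):
        reply = queue.Queue(maxsize=1)  # private reply channel per request
        self._requests.put((func, args, reply))
        ok, value = reply.get()  # block until the worker has answered
        if not ok:
            raise value
        return value

    def read(self, path):
        return self._submit(_read_file, path)

    def append(self, path, data):
        return self._submit(_append_file, path, data)

    def stop(self):
        self._requests.put(None)
        self._worker.join()


if __name__ == "__main__":
    # Illustrative usage with a throwaway file.
    io_thread = FileAccessThread()
    log_path = os.path.join(tempfile.mkdtemp(), "log.txt")
    io_thread.append(log_path, b"one\n")
    io_thread.append(log_path, b"two\n")
    print(io_thread.read(log_path))
    io_thread.stop()
```

    Because only the worker touches the files, the calling threads need no
    file mutexes of their own; other processes would still need OS-level
    locking.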

    HTH,

    Jeff M.
