File output and buffering

#1
08-19-2008, 04:27 PM
Maciej Sobczak

It seems to me that file output in the standard Ada library is not
buffered:
1. There is no buffer-related operation in the whole library.
2. The semantics of the output operations are defined in terms of their
effects on the external file.
3. The performance of a simple test is consistent with what can be
obtained from equivalent C code that flushes the channel after every
operation (i.e. some 15-20x slower than with default buffering).

Let's suppose that I want to add buffering to my output. I can write
a stream type that does the necessary magic, but how can I reuse the
formatting machinery that is already available in Ada.Text_IO and
related packages?

--
Maciej Sobczak * www.msobczak.com * www.inspirel.com

Database Access Library for Ada: www.inspirel.com/soci-ada

#2
08-20-2008, 02:45 AM
Georg Bauhaus

Maciej Sobczak wrote:

> Let's suppose that I want to add buffering to my output. I can write
> a stream type that does the necessary magic, but how can I reuse the
> formatting machinery that is already available in Ada.Text_IO and
> related packages?


Some formatting procedures from {Number}_IO and from Editing
can write to a String instead of to a File_Type.
Can you stream the strings to a buffer?
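
To make the idea concrete in C++ (one of the comparison languages used later in this thread), here is a minimal sketch of "format into a string first, then buffer the strings yourself". BufferedSink and format_reading are hypothetical names made up for this illustration, not part of any library. In Ada, the first step would correspond to the Put procedures that take a String target instead of a File_Type.

```cpp
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical buffered sink: formatted text accumulates in memory and is
// handed to the real output only once `limit` bytes have piled up.
class BufferedSink {
public:
    BufferedSink(std::ostream& target, std::size_t limit)
        : target_(target), limit_(limit) {}
    ~BufferedSink() { flush(); }  // do not lose the tail of the output

    // Append an already formatted string to the buffer.
    void put(const std::string& text) {
        pending_ += text;
        if (pending_.size() >= limit_) flush();
    }

    // Hand everything collected so far to the target stream.
    void flush() {
        target_ << pending_;
        pending_.clear();
    }

private:
    std::ostream& target_;
    std::string pending_;
    std::size_t limit_;
};

// Format a value into a string rather than straight into a file.
std::string format_reading(const std::string& label, double value) {
    std::ostringstream text;
    text << label << '=' << value << '\n';
    return text.str();
}
```

The formatting code never sees the buffer, so the existing formatting machinery is reused unchanged; only the final "write this string" step goes through the sink.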

There is an article on AdaPower entitled something like
"How to access memory as a String". I think it will
illustrate reliable tricks, possibly of some use when
handling data in the "external" world.

In any case, char_array values are good for OS
procedures with names like write, read, and so on.

--
Georg Bauhaus
Y A Time Drain http://www.9toX.de

#3
08-20-2008, 04:43 AM
Maciej Sobczak

On 19 Aug, 22:27, Maciej Sobczak <see.my.homep...@gmail.com> wrote:

> It seems to me that file output in the standard Ada library is not
> buffered:
> 1. There is no buffer-related operation in the whole library.
> 2. The semantics of the output operations are defined in terms of their
> effects on the external file.
> 3. The performance of a simple test is consistent with what can be
> obtained from equivalent C code that flushes the channel after every
> operation (i.e. some 15-20x slower than with default buffering).


Now I'm puzzled, because it looks like the files are written in chunks
of 32 kB. In other words, nothing is written to the file until the
accumulated output reaches 32 kB, and the same step holds for every
later write - this indicates that buffering is actually in use.

My original observations become questions:

1. Why is there no buffer-related operation in the whole library?
In particular: how can I *flush* the buffer?
This is very important for log files. I have discovered this exactly
because the log is not written synchronously with Put operations,
which makes it "a bit" less useful. How can I make sure that what I
Put is actually written? Closing a file after each Put does not make
much sense.

2. What about the semantics of Put?

3. Why is buffered Ada.Text_IO as slow as non-buffered C's stdio? Who
is eating the 20x factor?


#4
08-20-2008, 04:59 AM
Maciej Sobczak

On 20 Aug, 10:43, Maciej Sobczak <see.my.homep...@gmail.com> wrote:

I will answer myself:

> 1. Why is there no buffer-related operation in the whole library?


Heh, there is.

> In particular: how can I *flush* the buffer?


By calling Ada.Text_IO.Flush.
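
For the log-file case, the same pattern in C++ (one of the comparison languages used in this thread) is to flush explicitly after each record. A minimal sketch, where log_line is a hypothetical helper made up for this illustration. Note that flushing hands the data to the operating system; it does not by itself force the data onto the disk.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: append one record to a log and push it out of the
// library buffer right away, so other readers of the file can see it.
// Returns true when the stream is still in a good state afterwards.
bool log_line(std::ofstream& log, const std::string& message) {
    log << message << '\n';
    log.flush();  // plays the role of Ada.Text_IO.Flush (File)
    return log.good();
}
```

The price is one physical write per record, which is usually acceptable for logs and usually not for bulk output.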

Which means that Georg Bauhaus fell into the trap of my confusion. :-)

Still valid question:

> 3. Why is buffered Ada.Text_IO as slow as non-buffered C's stdio? Who
> is eating the 20x factor?



#5
08-20-2008, 05:21 AM
Dmitry A. Kazakov

On Wed, 20 Aug 2008 01:59:52 -0700 (PDT), Maciej Sobczak wrote:

> On 20 Aug, 10:43, Maciej Sobczak <see.my.homep...@gmail.com> wrote:
>
> Still valid question:
>
>> 3. Why is buffered Ada.Text_IO as slow as non-buffered C's stdio? Who
>> is eating the 20x factor?


Because of page formatting, I suggest. You can use text streams instead.
[But don't use String'Write! Although, the newest GNAT optimized that,
AFAIK.]

BTW, buffering does not make I/O faster. It obviously does the opposite.
Certainly, you didn't mean the "last-mile" buffer held by the driver, which
is usually inaccessible. In some older OSes you could write directly from a
user buffer mapped by the kernel, have record files, etc. That was *fast*.
But then came C, Unix and Co., you know... (:-))

--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de

#6
08-20-2008, 09:19 AM
Georg Bauhaus

Maciej Sobczak wrote:

>> In particular: how can I *flush* the buffer?

>
> By calling Ada.Text_IO.Flush.
>
> Which means that Georg Bauhaus fell into the trap of my confusion. :-)


Sort of, but, as you say, the issue remains.

> Still valid question:
>
>> 3. Why is buffered Ada.Text_IO as slow as non-buffered C's stdio? Who
>> is eating the 20x factor?



Text_IO is demonstrably slow. There are some speedy
shortcuts in the GNAT implementation of Put (e.g. Write_Buf).
But AFAICS there is (and has to be) a lot of protecting code
around the OS calls.

Using the following stupid programs for comparison,
and using strace, I get 3370 calls to write(2) from C,
but 50_000 from both C++ and Ada. Among other things open
to speculation (or open to inspection). There are 4622
different lines in the 50_000 lines of output.

I think that if you have a formatted (constrained) string,
system I/O using fputs and flush might be a lot faster
(modulo threading issues).


#include <stdio.h>
#include <string.h>

int main(void)
{
    /* 68 asterisks plus the terminating NUL */
    char s[68 + 1];
    memset(s, '*', 68);
    s[68] = '\0';

    for (int k = 0; k < 50000; ++k)
    {
        /* change one character per iteration so that the lines vary */
        s[k % 68] = (char)(33 + k % 67);
        fputs(s, stdout);
        fputc('\n', stdout);
    }
    return 0;
}



#include <iostream>
#include <string>

int main()
{
    std::string s(68, '*');  // 68 asterisks

    for (int k = 0; k < 50000; ++k)
    {
        // change one character per iteration so that the lines vary
        s[k % 68] = static_cast<char>(33 + k % 67);
        std::cout << s << std::endl;
    }
    return 0;
}

with Ada.Text_IO;

procedure Ada_Wrt is
   S : String := (1 .. 68 => '*');
begin
   for K in 0 .. 50_000 - 1 loop
      --  change one character per iteration so that the lines vary
      S (1 + K rem 68) := Character'Val (33 + K rem 67);
      Ada.Text_IO.Put_Line (S);
   end loop;
end Ada_Wrt;
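
As a rough cross-check of the strace counts above, the number of write(2) calls can be estimated from the output size. The sketch below assumes a simplified model in which the library buffer goes to the OS only when it is full, plus once at exit. The 1024-byte buffer size is an assumption chosen because it reproduces the observed figure; it was not measured here. The function and constant names are made up for this illustration.

```cpp
#include <cstddef>

// Simplified model (an assumption, not a statement about any real libc):
// output is handed to the OS only when the buffer is full, plus one final
// write for whatever is left at exit.
constexpr std::size_t writes_when_buffered(std::size_t total_bytes,
                                           std::size_t buffer_size) {
    return (total_bytes + buffer_size - 1) / buffer_size;  // ceiling division
}

// 50,000 lines, each 68 characters plus a newline.
constexpr std::size_t kLines = 50000;
constexpr std::size_t kBytes = kLines * (68 + 1);  // 3,450,000 bytes

// A 1024-byte buffer (an assumed size) gives the 3370 writes seen for C.
static_assert(writes_when_buffered(kBytes, 1024) == 3370, "1 KiB buffer");

// Flushing after every line costs one write per line instead.
constexpr std::size_t kWritesWhenFlushedPerLine = kLines;  // 50,000
```

So 3370 calls is what a 1 KiB buffer would give for 3,450,000 bytes, and 50_000 calls is what one flush per line would give.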


#7
08-20-2008, 10:41 AM
Maciej Sobczak

On 20 Aug, 15:19, Georg Bauhaus <rm.dash-bauh...@futureapps.de> wrote:

> Using the following stupid programs for comparison,
> and using strace, I get 3370 calls to write(2) from C,
> but 50_000 from both C++ and Ada.


The C++ part can be explained by the fact that you did not use it
properly.

>     std::cout << s << std::endl;


Try this instead:

std::cout << s << '\n';

The difference is that std::endl performs *two* actions on the given
stream: it inserts the newline and... flushes. If you intend to only
insert the newline character, do what you mean. It is even less
typing.

(yes, 99% of "benchmarks" available on the web are broken for the same
reason)
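
The difference can also be observed without strace by counting the flush requests that reach the stream buffer. A minimal sketch: CountingBuf and count_flushes are hypothetical names made up for this illustration. The only library behaviour relied on is that std::endl calls flush() on the stream, which in turn calls the buffer's sync().

```cpp
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper: an in-memory buffer that counts flush requests.
class CountingBuf : public std::stringbuf {
public:
    std::size_t flushes = 0;

protected:
    int sync() override {
        ++flushes;  // reached through std::ostream::flush()
        return 0;   // report success
    }
};

// Writes `lines` copies of `text`, ending each either with std::endl or
// with '\n', and returns how many flush requests reached the buffer.
std::size_t count_flushes(int lines, const std::string& text, bool use_endl) {
    CountingBuf buf;
    std::ostream out(&buf);
    for (int i = 0; i < lines; ++i) {
        if (use_endl) {
            out << text << std::endl;  // newline plus flush
        } else {
            out << text << '\n';       // newline only
        }
    }
    return buf.flushes;
}
```

For 50,000 lines the std::endl variant reports 50,000 flush requests and the '\n' variant reports none.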


#8
08-20-2008, 10:44 AM
Maciej Sobczak

On 20 Aug, 11:21, "Dmitry A. Kazakov" <mail...@dmitry-kazakov.de>
wrote:

> BTW, buffering does not make I/O faster. It obviously does the opposite.


You must be using some strange timer or a specially distorted
definition of I/O.

Buffering minimizes the overhead that is incurred by each
physical output operation. If you can produce the same amount of data
with fewer operations, then the total overhead is smaller.


#9
08-20-2008, 11:39 AM
Dmitry A. Kazakov

On Wed, 20 Aug 2008 07:44:48 -0700 (PDT), Maciej Sobczak wrote:

> On 20 Aug, 11:21, "Dmitry A. Kazakov" <mail...@dmitry-kazakov.de>
> wrote:
>
>> BTW, buffering does not make I/O faster. It obviously does the opposite.

>
> You must be using some strange timer or a specially distorted
> definition of I/O.
>
> Buffering minimizes the overhead that is incurred by each
> physical output operation.


Buffering is used to make I/O in an asynchronous and/or conveyered way.
That does not make I/O faster in terms of latencies.

Any language buffer on top of the numerous layered buffers typical of an OS
adds nothing but overhead.


#10
08-21-2008, 03:10 AM
Maciej Sobczak

On 20 Aug, 17:39, "Dmitry A. Kazakov" <mail...@dmitry-kazakov.de>
wrote:

> Buffering is used to make I/O in an asynchronous and/or conveyered way.


No, it is not asynchronous. Nothing happens in the background; the
operations are only grouped. The group is (usually) transmitted in a
synchronous fashion.

I do not know what "conveyered" means.

> That does not make I/O faster in terms of latencies.


It does make it faster in terms of throughput.

Note: I do not imply that throughput is more valuable for optimization
than latency - these can be different goals and usually are.

> Any language buffer on top of the numerous layered buffers typical of an OS
> adds nothing but overhead.


It can reduce the overhead that is associated with the number of
requests. System calls are not free, and there is also significant
latency in the medium that is better avoided (like network
round trips or disk seek times).

All times are GMT -5.