plot of large data file = large .eps file? : Graphics

  #1  
Old 01-05-2006, 10:29 AM
Default plot of large data file = large .eps file?

I have several 17 MB data-files like the following one:

http://www.cs.kuleuven.ac.be/~bartv/...lem/mydata.dat

I plot the data from these files into an .eps (for inclusion into
a LaTeX document) using the following gnuplot commands:

http://www.cs.kuleuven.ac.be/~bartv/...blem/myplot.pl

The resulting .eps is

http://www.cs.kuleuven.ac.be/~bartv/...lem/mydata.eps

The problem is that due to the one-million data-points, the size
of the .eps file grows up to 7.1 MB. And I have to include
several of these into my LaTeX document, which makes the size of
my LaTeX document quite large and also reading it with some
PostScript viewer is not a pleasant experience because the
figures take quite a while to load.

So here's my question: how can I generate much smaller .eps files
without losing any visual information? The smaller figures should
look identical to the large ones but take up much less space on
disk, so that my final LaTeX document is much smaller and a page
containing these figures doesn't take ages to load.

Thanks for any advice.

Bart

--
"Share what you know. Learn what you don't."
  #2  
Old 01-05-2006, 11:09 AM
Default Re: plot of large data file = large .eps file?

Bart Vandewoestyne wrote:

> So here's my question: how can I generate much smaller .eps files
> without losing any visual information (the small-size figures should
> look identical to the large-size ones, but have a much smaller
> size on the hard disk so my final LaTeX document is much smaller in
> size and so that it doesn't take ages to load a page containing these
> figures).


Compute the dots that will actually be displayed horizontally in the
printed LaTeX paper (assuming 300 or 600 dpi) and resample your data to
have that many points (in log steps in your case).
I don't think this really has anything to do with gnuplot, and you will
have to do the resampling with an external tool. I think there are even
Perl and/or Python packages to do that.
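
A minimal sketch of that resampling idea in Python, assuming the data file has one row per equally spaced x value, so that picking row numbers in log steps gives points that are roughly evenly spread on a log x-axis. The function names and the 6 inch by 300 dpi figure width are made up for illustration.

```python
import math

def log_spaced_indices(n_points, n_keep):
    """Return sorted, unique 0-based indices spaced evenly in log(index + 1)."""
    if n_points <= 0 or n_keep <= 0:
        return []
    if n_keep == 1 or n_points == 1:
        return [0]
    top = math.log(n_points)  # log of the largest 1-based row number
    picked = set()
    for j in range(n_keep):
        # 1-based row number of the j-th equal step on a log scale
        pos = math.exp(top * j / (n_keep - 1))
        picked.add(min(n_points, max(1, round(pos))) - 1)
    return sorted(picked)

def resample_log(rows, n_keep):
    """Keep only the rows at log-spaced positions."""
    return [rows[i] for i in log_spaced_indices(len(rows), n_keep)]

if __name__ == "__main__":
    # Assumed figure width: 6 inches at 300 dpi, about 1800 horizontal dots
    n_keep = 6 * 300
    rows = [(x, 1.0 / x) for x in range(1, 1_000_001)]
    small = resample_log(rows, n_keep)
    print(len(rows), "->", len(small))
```

Duplicate row numbers at the small-x end are merged, so the result can hold fewer than n_keep rows. Like any plain subsampling, this can drop isolated outliers.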
--
Grüße von
Peter.
  #3  
Old 01-05-2006, 11:37 AM
Default Re: plot of large data file = large .eps file?

On 2006-01-05, Peter Weilbacher <newsspam@weilbacher.org> wrote:
>
> Compute the dots that will actually be displayed horizontally in the
> printed LaTeX paper (assuming 300 or 600 dpi) and resample your data to
> have that many points (in log steps in your case).
> I don't think this really has anything to do with gnuplot, and you will
> have to do the resampling with an external tool. I think there are even
> Perl and/or Python packages to do that.


In the meantime, I have found an alternative method that partly
solves my problem. If I use

plot "mydata.dat" u 1:2 every 100 title "" with lines lt 1

for example, then the file size of the resulting .eps is already
drastically reduced. The only `bad' thing about this method is
that the data points at the beginning of course vanish...
I currently `solved' that problem by simply starting my plot from
1000 or 10000 on the X-axis... but that is of course a bit of a `hack'
and changes the visual output slightly...

Somehow, I feel the need for some kind of

plot "mydata.dat" u 1:2 every <some way to sample in log steps>

but I guess this is not possible in gnuplot, so I would have
to pre-process my data files using some kind of text-processing
language like awk or perl...?

Best wishes,
Bart

--
"Share what you know. Learn what you don't."
  #4  
Old 01-05-2006, 12:08 PM
Default Re: plot of large data file = large .eps file?

In article <1136474988.789174@seven.kulnet.kuleuven.ac.be>,
Bart Vandewoestyne <MyFirstName.MyLastName@telenet.be> wrote:
>
>The problem is that due to the one-million data-points, the size
>of the .eps file grows up to 7.1 MB. And I have to include
>several of these into my LaTeX document, which makes the size of
>my LaTeX document quite large and also reading it with some
>PostScript viewer is not a pleasant experience because the
>figures take quite a while to load.



PostScript (including eps) is a vector-based description of
your plot, and contains an explicit description of every one
of your millions of points even though most of them lie on
top of each other and cannot be seen.

Pixel-based output formats like PNG, however, do not suffer
from this problem. Each point is drawn onto a pixel array,
but if a million points all set the same pixel to black...
well it's still only 1 pixel, and doesn't take up any
additional space in the output image.

So for this sort of plot I recommend using PNG rather than
eps. You can still include it in a LaTeX document.

As you have already discovered, you can also down-sample
your data set to reduce the size of an *.eps image.
There is, however, a strong disadvantage to doing this in
some cases. If the purpose of your plot is to discover and
highlight outliers or minority regions in the distribution
of points, then by down-sampling you may lose exactly the
information you want to highlight. A pixel-based rendering
of all points does not suffer from this effect.
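
To make the pixel argument concrete, here is a small Python sketch that maps points onto a pixel grid and counts how many distinct pixels are touched. The 640x480 grid and the synthetic noisy curve are assumptions for illustration only; this is not how gnuplot's png terminal is implemented.

```python
import random

def occupied_pixels(points, width_px, height_px, x_range, y_range):
    """Map (x, y) points onto a pixel grid and return the set of pixels hit."""
    x_min, x_max = x_range
    y_min, y_max = y_range
    pixels = set()
    for x, y in points:
        col = int((x - x_min) / (x_max - x_min) * (width_px - 1))
        row = int((y - y_min) / (y_max - y_min) * (height_px - 1))
        pixels.add((col, row))  # a pixel hit twice is still stored once
    return pixels

if __name__ == "__main__":
    random.seed(0)
    # Hypothetical data: a noisy decaying curve with 200,000 samples
    pts = [(i, 1.0 / (i + 1) + random.uniform(0, 0.01)) for i in range(200_000)]
    hit = occupied_pixels(pts, 640, 480, (0, 199_999), (0.0, 1.01))
    print(len(pts), "points touch", len(hit), "pixels")
```

However many points are drawn, the number of touched pixels can never exceed width_px * height_px, which is why the image size stays bounded.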

>So here's my question: how can I generate much smaller .eps files
>without losing any visual information (the small-size figures should
>look identical to the large-size ones, but have a much smaller
>size on the hard disk so my final LaTeX document is much smaller in
>size and so that it doesn't take ages to load a page containing these
>figures).


If the small features are just as important (or more important)
than the large features, then I recommend using png rather than
eps output for the reasons I gave above.

--
Ethan A Merritt
  #5  
Old 01-05-2006, 12:52 PM
Default Re: plot of large data file = large .eps file?

Bart Vandewoestyne wrote:

> I have several 17 MB data-files like the following one:
>

....

> http://www.cs.kuleuven.ac.be/~bartv/...lem/mydata.eps
>

...

> Thanks for any advice.


Do you really need all the samples?

I had the same problem some years ago, and a few quick and dirty awk scripts
solved it fast :-)

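------reduce.awk
# Thin out a data file: keep the first sw1 lines, then every d1-th line
# up to line sw2, then every d2-th line. A large jump in column 1
# re-opens a window of 1000 fully kept lines.
BEGIN { i = 0; k = 0; sw1 = 1000; d1 = 10; sw2 = 10000; d2 = 30 }
{ i += 1; tv = t; t = $1 }    # i: line count, tv/t: previous/current x
{
    if (i <= sw1) { print $0 }
    else if ((t - tv) >= (3 * t / i)) { sw1 = 1000 + i; sw2 = 1000 + i; print $0 }
    else {
        k += 1
        if (i <= sw2) { if (k >= d1) { k = 0; print $0 } }
        else if (k >= d2) { k = 0; print $0 }
    }
}
---------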

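--------reduce3.awk
# Like reduce.awk, with a third stage (every d3-th line after line sw3).
# Repeated x values in the first stage are skipped, and a large jump in
# column 1 also prints the two lines that preceded it.
BEGIN { i = 0; k = 0; sw1 = 1000; d1 = 10; sw2 = 10000; d2 = 30; sw3 = 50000; d3 = 100 }
{ i += 1; tv = t; t = $1 }    # i: line count, tv/t: previous/current x
{
    if (i <= sw1) { if (t == tv) { i -= 1 } else { print $0 } }
    else if ((t - tv) >= (3 * t / i)) { sw1 = 1000 + i; sw2 = 1000 + i; print zv1 "\n" zv "\n" $0 }
    else {
        k += 1
        if (i <= sw3) {
            if (i <= sw2) { if (k >= d1) { k = 0; print $0 } }
            else if (k >= d2) { k = 0; print $0 }
        }
        else if (k >= d3) { k = 0; print $0 }
    }
}
{ zv1 = zv; zv = $0 }         # remember the two most recent lines
--------------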

reduce3 in particular defines three steps. At first every value is
taken; after sw1 samples every d1-th value, after sw2 samples every
d2-th, and after sw3 samples every d3-th.

They left around 1/30 of the original data, and the noise at the end looked
the same (subjectively :-) The PS file then becomes small as well :-)

Olaf
  #6  
Old 01-05-2006, 05:28 PM
Default [OT] Re: plot of large data file = large .eps file?

On 2006-01-05, Ethan Merritt <merritt@u.washington.edu> wrote:
>
> PostScript (including eps) is a vector-based description of
> your plot, and contains an explicit description of every one
> of your millions of points even though most of them lie on
> top of each other and cannot be seen.


OK.

> Pixel-based output formats like PNG, however, do not suffer
> from this problem. Each point is drawn onto a pixel array,
> but if a million points all set the same pixel to black...
> well it's still only 1 pixel, and doesn't take up any
> additional space in the output image.


OK.

> So for this sort of plot I recommend using PNG rather than
> eps. You can still include it in a LaTeX document.


I see. This is something new to me... then how do you actually
include your PNG into your LaTeX document? Also using a simple
\includegraphics ? Some quick googling taught me
that I might have to use `pdflatex' instead of `latex' then? And
can I then still have a postscript as output with a quality that
is acceptable to send in for publication in a journal?

> As you have already discovered, you can also down-sample
> your data set to reduce the size of an *.eps image.
> There is, however, a strong disadvantage to doing this in
> some cases. If the purpose of your plot is to discover and
> highlight outliers or minority regions in the distribution
> of points, then by down-sampling you may lose exactly the
> information you want to highlight. A pixel-based rendering
> of all points does not suffer from this effect.


I completely agree with that, and it would indeed be nice if the
outliers were still there... A possible solution would be (off
the top of my head):

1) determine the number of points to draw based on some criterion
for disk-space or resolution

2) calculate the points, let's call them x_i

3) determine the maximum and minimum value 'in the neighbourhood' of
these points x_i, call them max_neigh(x_i) and min_neigh(x_i)

4) do not only plot the value for x_i, but also the value
max_neigh(x_i) and min_neigh(x_i) at the point x_i so you get all
the outliers if they are there


But this approach does seem to require quite some pre-processing
on my data-file... :-(
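
The steps above could be sketched roughly as follows in Python. This is a variant under stated assumptions, not exactly the procedure described: it splits the (x-sorted) points into equal-count bins and keeps each bin's minimum and maximum at their own x positions, rather than plotting them at a chosen x_i. The function name and the equal-count binning are illustrative choices.

```python
def minmax_decimate(points, n_bins):
    """Reduce (x, y) tuples to at most 2 * n_bins points, keeping bin extremes.

    points must be sorted by x. Bins hold equal numbers of points (an
    assumption); for a log x-axis one could bin on log(x) instead.
    """
    n = len(points)
    if n == 0 or n_bins <= 0:
        return []
    out = []
    for b in range(n_bins):
        lo = b * n // n_bins
        hi = (b + 1) * n // n_bins
        chunk = points[lo:hi]
        if not chunk:
            continue  # more bins than points
        p_min = min(chunk, key=lambda p: p[1])
        p_max = max(chunk, key=lambda p: p[1])
        # emit the extremes in x order; a flat bin yields a single point
        out.extend(sorted({p_min, p_max}))
    return out

if __name__ == "__main__":
    pts = [(i, 0.0) for i in range(1000)]
    pts[437] = (437, 9.0)    # an outlier that plain "every 100" would drop
    print(minmax_decimate(pts, 10))
```

Because every bin contributes its extremes, a single spike survives the reduction, which plain subsampling with `every` does not guarantee.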

> If the small features are just as important (or more important)
> than the large features, then I recommend using png rather than
> eps output for the reasons I gave above.


I think I'm going to experiment with the PNG-approach tomorrow.
I just have to figure out how to best include .png's into my
LaTeX document and thereby still keep the quality of my document
good enough so that I can send it to a journal.

Thanks for the advice,
Bart

--
"Share what you know. Learn what you don't."
  #7  
Old 01-05-2006, 06:03 PM
Default Re: plot of large data file = large .eps file?


Bart Vandewoestyne wrote:

> I see. This is something new to me... then how do you actually
> include your PNG into your LaTeX document? Also using a simple
> \includegraphics ? Some quick googling taught me
> that I might have to use `pdflatex' instead of `latex' then? And
> can I then still have a postscript as output with a quality that
> is acceptable to send in for publication in a journal?


I think that if you can use it, pdflatex is the solution: it also accepts
pdf as input, which is vector-based graphics like eps but compressed.
So you could just use the pdf terminal that gnuplot has, and you'll
work in a very smooth way....

It also accepts png and jpeg (I don't use this format, but I'm pretty
sure it works...).

Most journals accept pdflatex nowadays.
The output is (of course) a pdf, which has the same quality as ps (at
least for us users), but some hyper-skilled people might argue with this....
:-)

  #8  
Old 01-05-2006, 11:41 PM
Default Re: [OT] Re: plot of large data file = large .eps file?

In article <1136500080.820601@seven.kulnet.kuleuven.ac.be>,
Bart Vandewoestyne <MyFirstName.MyLastName@telenet.be> wrote:
>
>I see. This is something new to me... then how do you actually
>include your PNG into your LaTeX document? Also using a simple
>\includegraphics ?


\usepackage{graphicx}
....
\includegraphics[clip]{figure.png}

>Some quick googling taught me
>that I might have to use `pdflatex' instead of `latex' then?


Not that I know of. I don't normally use pdflatex.

>And can I then still have a postscript as output with a quality that
>is acceptable to send in for publication in a journal?


You can make the png image as high-resolution as you like,
and force its size on the page via a parameter
to \includegraphics. But yes, you are limited to whatever
resolution your png image is; it doesn't scale smoothly the
way PostScript does.

>I think I'm going to experiment with the PNG-approach tomorrow.
>I just have to figure out how to best include .png's into my
>LaTeX document and thereby still keep the quality of my document
>good enough so that I can send it to a journal.


I have not had any problems with this, although sometimes the
journal print office wants a higher resolution png image for the final
press run.


--
Ethan A Merritt
  #9  
Old 01-06-2006, 04:13 AM
Default Re: plot of large data file = large .eps file?

On 2006-01-05, tv <tommaso.vinci@gmail.com> wrote:
>
> I think that if you can use it, pdflatex is the solution: it also accepts
> pdf as input, which is vector-based graphics like eps but compressed.
> So you could just use the pdf terminal that gnuplot has, and you'll
> work in a very smooth way....


Hmm... strange... I wanted to experiment with the pdf terminal
today, but apparently I can't use it because I get the error:

unknown or ambiguous terminal type; type just 'set terminal' for a list

and it is indeed not listed if I type

gnuplot> set terminal

However, I am able to view its help page with

gnuplot> help set terminal pdf

Am I doomed to not being able to use the pdf terminal with my
version of gnuplot? Or is there something I should ask our
sysadmins to install so I can use the pdf terminal?

I'm using:

G N U P L O T
Version 4.0 patchlevel 0
last modified Thu Apr 15 14:44:22 CEST 2004
System: Linux 2.6.8-2-686-smp

on a Debian GNU/Linux stable box.

Best wishes,
Bart

--
"Share what you know. Learn what you don't."
  #10  
Old 01-06-2006, 04:38 AM
Default Re: [OT] Re: plot of large data file = large .eps file?

On 2006-01-06, Ethan Merritt <merritt@u.washington.edu> wrote:
>
> \usepackage{graphicx}
> ...
> \includegraphics[clip]{figure.png}
>
>>Some quick googling taught me
>>that I might have to use `pdflatex' instead of `latex' then?

>
> Not that I know of. I don't normally use pdflatex.


Hmm... so the above works for you if you simply compile your
document with `latex myfile.tex'? Strange... if I do this, then
I get:

LaTeX Error: Cannot determine size of graphic in mydata.png (no BoundingBox)

If I process it with `pdflatex myfile.tex' then it works...

Regards,
Bart

--
"Share what you know. Learn what you don't."