plot of large data file = large .eps file? : Graphics
#1
I have several 17 MB data-files like the following one:

http://www.cs.kuleuven.ac.be/~bartv/...lem/mydata.dat

I plot the data from these files into an .eps (for inclusion into a LaTeX document) using the following gnuplot commands:

http://www.cs.kuleuven.ac.be/~bartv/...blem/myplot.pl

The resulting .eps is

http://www.cs.kuleuven.ac.be/~bartv/...lem/mydata.eps

The problem is that, due to the one million data points, the .eps file grows to 7.1 MB. I have to include several of these in my LaTeX document, which makes the document quite large. Reading it with a PostScript viewer is also not a pleasant experience, because the figures take quite a while to load.

So here's my question: how can I generate much smaller .eps files without losing any visual information? The small figures should look identical to the large ones, but take much less space on disk, so that my final LaTeX document is much smaller and a page containing these figures doesn't take ages to load.

Thanks for any advice.
Bart
--
"Share what you know. Learn what you don't."

#2
Bart Vandewoestyne wrote:
> So here's my question: how can I generate much smaller .eps files
> without losing any visual information (the small-size figures should
> look identical to the large-size ones, but have a much smaller
> size on the hard disk so my final LaTeX document is much smaller in
> size and so that it doesn't take ages to load a page containing these
> figures).

Compute the number of dots that will actually be displayed horizontally in the printed LaTeX paper (assuming 300 or 600 dpi) and resample your data to have that many points (in log steps in your case).

I don't think this really has anything to do with gnuplot, and you will have to do the resampling with an external tool. I think there are even Perl and/or Python packages to do that.
--
Greetings from Peter.
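The resampling suggested above could be sketched in Python roughly as follows. This is only an illustration, not something posted in the thread: the function names `log_spaced_indices` and `resample_lines` are made up here, and the sketch assumes that the x value grows roughly in proportion to the row number, so that picking rows in log steps also spaces the kept points evenly on a logarithmic x axis.

```python
import math


def log_spaced_indices(n_rows, n_keep):
    """Return sorted, unique 0-based row indices spread evenly on a log scale.

    Hypothetical helper: assumes row number is roughly proportional to x.
    """
    if n_rows <= 0 or n_keep <= 0:
        return []
    if n_keep >= n_rows:
        return list(range(n_rows))
    if n_keep == 1:
        return [0]
    # Row numbers 1..n_rows, evenly spaced in log10; duplicates at the
    # dense start of the file collapse into one entry via the set.
    top = math.log10(n_rows)
    picked = {
        min(n_rows, max(1, round(10 ** (top * k / (n_keep - 1))))) - 1
        for k in range(n_keep)
    }
    return sorted(picked)


def resample_lines(lines, n_keep):
    """Keep about n_keep data lines, skipping blank lines and '#' comments."""
    rows = [ln for ln in lines if ln.strip() and not ln.lstrip().startswith("#")]
    return [rows[i] for i in log_spaced_indices(len(rows), n_keep)]
```

For a figure 5 inches wide printed at 300 dpi, `n_keep` would be about 1500, so `log_spaced_indices(1_000_000, 1500)` returns at most 1500 row numbers, always including the first and the last row. Note that plain subsampling like this can drop outliers.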

#3
On 2006-01-05, Peter Weilbacher <newsspam@weilbacher.org> wrote:
>
> Compute the dots that will actually be displayed horizontally in the
> printed LaTeX paper (assuming 300 or 600 dpi) and resample your data to
> have that many points (in log steps in your case).
> I don't think this really has anything to do with gnuplot, and you will
> have to do the resampling with an external tool. I think there are even
> Perl and/or Python packages to do that.

In the meantime, I have found an alternative method that partly solves my problem. If I use

    plot "mydata.dat" u 1:2 every 100 title "" with lines lt 1

for example, then the file size of the resulting .eps is already drastically reduced. The only `bad' thing about this method is that the data points at the beginning of course vanish... I currently `solved' that problem by simply starting my plot from 1000 or 10000 on the X-axis... but that is of course a bit of a `hack' and changes the visual output slightly...

Somehow, I feel the need for some kind of

    plot "mydata.dat" u 1:2 every <some way to sample in log steps>

but I guess this is not possible in gnuplot, and I therefore have to pre-process my data files using some kind of text-processing language like awk or perl...?

Best wishes,
Bart
--
"Share what you know. Learn what you don't."

#4
In article <1136474988.789174@seven.kulnet.kuleuven.ac.be>,
Bart Vandewoestyne <MyFirstName.MyLastName@telenet.be> wrote:
>
>The problem is that due to the one-million data-points, the size
>of the .eps file grows up to 7.1 MB. And I have to include
>several of these into my LaTeX document, which makes the size of
>my LaTeX document quite large and also reading it with some
>PostScript viewer is not a pleasant experience because the
>figures take quite a while to load.

PostScript (including eps) is a vector-based description of your plot, and contains an explicit description of every one of your million points even though most of them lie on top of each other and cannot be seen.

Pixel-based output formats like PNG, however, do not suffer from this problem. Each point is drawn onto a pixel array, but if a million points all set the same pixel to black... well, it's still only 1 pixel, and doesn't take up any additional space in the output image.

So for this sort of plot I recommend using PNG rather than eps. You can still include it in a LaTeX document.

As you have already discovered, you can also down-sample your data set to reduce the size of an *.eps image. There is, however, a strong disadvantage to doing this in some cases. If the purpose of your plot is to discover and highlight outliers or minority regions in the distribution of points, then by down-sampling you may lose exactly the information you want to highlight. A pixel-based rendering of all points does not suffer from this effect.

>So here's my question: how can I generate much smaller .eps files
>without losing any visual information (the small-size figures should
>look identical to the large-size ones, but have a much smaller
>size on the hard disk so my final LaTeX document is much smaller in
>size and so that it doesn't take ages to load a page containing these
>figures).

If the small features are just as important as (or more important than) the large features, then I recommend using png rather than eps output for the reasons I gave above.
--
Ethan A Merritt
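The point about pixel-based output can be illustrated with a small Python sketch. It is an assumption-laden toy, not how gnuplot's PNG driver works internally: the hypothetical function `occupied_pixels` simply maps each data point onto a grid of `width_px` by `height_px` cells and records which cells are hit, ignoring line joins, anti-aliasing and PNG compression.

```python
def occupied_pixels(points, x_range, y_range, width_px, height_px):
    """Return the set of (col, row) pixels hit by at least one data point.

    Hypothetical illustration: points outside the plot range are skipped,
    and each point is treated as a single pixel.
    """
    x_min, x_max = x_range
    y_min, y_max = y_range
    hit = set()
    for x, y in points:
        if not (x_min <= x <= x_max and y_min <= y <= y_max):
            continue
        # Scale into pixel coordinates; clamp the upper edge into the grid.
        col = min(width_px - 1, int((x - x_min) / (x_max - x_min) * width_px))
        row = min(height_px - 1, int((y - y_min) / (y_max - y_min) * height_px))
        hit.add((col, row))
    return hit
```

However many points go in, the result can never contain more than `width_px * height_px` entries, which is why the size of a raster image is bounded by its resolution rather than by the number of data points. A vector file, by contrast, stores every point it is given.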

#5
Bart Vandewoestyne wrote:
> I have several 17 MB data-files like the following one:
> ....
> http://www.cs.kuleuven.ac.be/~bartv/...lem/mydata.eps
> ...
> Thanks for any advice.

Do you really mean that you need all the samples? I had the same problem some years ago, and some small, dirty awk scripts solved it fast :-)

------reduce.awk
BEGIN { i=0; k=0; sw1=1000; d1=10; sw2=10000; d2=30 }
# i counts rows, t is the current value of column 1, tv the previous one
{ i+=1; tv=t; t=$1 }
{
  if (i <= sw1) { print $0 }              # dense phase: keep every row
  else if ((t-tv) >= (3*t/i)) {           # gap of at least 3x the mean step t/i:
    sw1=1000+i; sw2=1000+i; print $0      # keep the row, restart the dense phase
  }
  else {
    k+=1
    if (i <= sw2) { if (k>=d1) { k=0; print $0 } }   # keep every d1-th row
    else if (k>=d2) { k=0; print $0 }                # later: every d2-th row
  }
}
END {}
---------

--------reduce3.awk
BEGIN { i=0; k=0; sw1=1000; d1=10; sw2=10000; d2=30; sw3=50000; d3=100 }
{ i+=1; tv=t; t=$1 }
{
  if (i <= sw1) { if (t==tv) { i-=1 } else { print $0 } }   # skip repeated x values
  else if ((t-tv) >= (3*t/i)) {
    sw1=1000+i; sw2=1000+i; print zv1"\n"zv"\n"$0   # also print the two previous rows
  }
  else {
    k+=1
    if (i <= sw3) {
      if (i <= sw2) { if (k>=d1) { k=0; print $0 } }
      else if (k>=d2) { k=0; print $0 }
    }
    else if (k>=d3) { k=0; print $0 }               # after sw3 rows: every d3-th row
  }
}
{ zv1=zv; zv=$0 }   # remember the two most recent rows
END {}
--------------

reduce3 in particular lets you define three stages. In the first, every value is taken; after sw1 samples, every d1-th; after sw3 samples ....

They left around 1/30 of the original data, and the noise at the end looked the same (subjectively :-)

Then the PS file becomes small too :-)

Olaf

#6
On 2006-01-05, Ethan Merritt <merritt@u.washington.edu> wrote:
>
> PostScript (including eps) is a vector-based description of
> your plot, and contains an explicit description of every one
> of your millions of points even though most of them lie on
> top of each other and cannot be seen.

OK.

> Pixel-based output formats like PNG, however, do not suffer
> from this problem. Each point is drawn onto a pixel array,
> but if a million points all set the same pixel to black...
> well it's still only 1 pixel, and doesn't take up any
> additional space in the output image.

OK.

> So for this sort of plot I recommend using PNG rather than
> eps. You can still include it in a LaTeX document.

I see. This is something new to me... then how do you actually include your PNG in your LaTeX document? Also using a simple \includegraphics? Some quick googling taught me that I might have to use `pdflatex' instead of `latex' then? And can I then still have PostScript output with a quality that is acceptable to send in for publication in a journal?

> As you have already discovered, you can also down-sample
> your data set to reduce the size of an *.eps image.
> There is, however, a strong disadvantage to doing this in
> some cases. If the purpose of your plot is to discover and
> highlight outliers or minority regions in the distribution
> of points, then by down-sampling you may lose exactly the
> information you want to highlight. A pixel-based rendering
> of all points does not suffer from this effect.

I completely agree with that, and it would indeed be nice if the outliers were still there...

A possible solution would be (off the top of my head):

1) determine the number of points to draw based on some criterion for disk space or resolution
2) calculate the points, let's call them x_i
3) determine the maximum and minimum value 'in the neighbourhood' of these points x_i, call them max_neigh(x_i) and min_neigh(x_i)
4) plot not only the value for x_i, but also the values max_neigh(x_i) and min_neigh(x_i) at the point x_i, so you get all the outliers if they are there

But this approach does seem to require quite some pre-processing of my data file... :-(

> If the small features are just as important (or more important)
> than the large features, then I recommend using png rather than
> eps output for the reasons I gave above.

I think I'm going to experiment with the PNG approach tomorrow. I just have to figure out how best to include .png files in my LaTeX document while still keeping the quality of my document good enough to send it to a journal.

Thanks for the advice,
Bart
--
"Share what you know. Learn what you don't."
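The four steps above could be sketched in Python along these lines. This is a hedged illustration with a made-up function name, `minmax_decimate`, and it takes two liberties with the description: the 'neighbourhood' of each x_i is simply a bin of consecutive samples of equal count (bins in log steps would work the same way), and the minimum and maximum are kept at their own x positions instead of being moved to x_i.

```python
def minmax_decimate(xs, ys, n_bins):
    """Reduce (xs, ys) to at most 3 points per bin without losing extremes.

    Hypothetical sketch: for every bin of consecutive samples, keep the
    first sample plus the samples holding the bin's minimum and maximum y.
    """
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    if n_bins <= 0:
        raise ValueError("n_bins must be positive")
    n = len(xs)
    out = []
    for b in range(n_bins):
        lo = b * n // n_bins
        hi = (b + 1) * n // n_bins
        if lo == hi:
            continue  # more bins than samples: this bin is empty
        window = range(lo, hi)
        i_min = min(window, key=lambda i: ys[i])
        i_max = max(window, key=lambda i: ys[i])
        # The set removes duplicates; sorting keeps the points in x order.
        for i in sorted({lo, i_min, i_max}):
            out.append((xs[i], ys[i]))
    return out
```

With one million samples and 1500 bins this leaves at most 4500 points, and every spike that was the extreme of its bin survives, which plain `every 100` style subsampling does not guarantee.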

#7
Bart Vandewoestyne wrote:
> I see. This is something new to me... then how do you actually
> include your PNG into your LaTeX document? Also using a simple
> \includegraphics ? I did some quick googling which taught me
> that I might have to use `pdflatex' instead of `latex' then? And
> can I then still have a postscript as output with a quality that
> is acceptable to send in for publication in a journal?

I think that, if you can, pdflatex is the solution: it also takes PDF as input, which is vector-based graphics like eps, but compressed. So you could just use the pdf terminal that gnuplot has, and you'll work in a very smooth way....

It also accepts png and jpeg (I don't use these formats, but I'm pretty sure it works...).

Most journals accept pdflatex nowadays. The output is (of course) a pdf, which has the same quality as ps (at least for us users), but some hyper-skilled people might argue with this.... :-)

#8
In article <1136500080.820601@seven.kulnet.kuleuven.ac.be>,
Bart Vandewoestyne <MyFirstName.MyLastName@telenet.be> wrote:
>
>I see. This is something new to me... then how do you actually
>include your PNG into your LaTeX document? Also using a simple
>\includegraphics ?

    \usepackage{graphicx}
    ....
    \includegraphics[clip]{figure.png}

>I did some quick googling which taught me
>that I might have to use `pdflatex' instead of `latex' then?

Not that I know of. I don't normally use pdflatex.

>And can I then still have a postscript as output with a quality that
>is acceptable to send in for publication in a journal?

You can make the png image as high-resolution as you like, and force its size on the page via a parameter to \includegraphics. But yes, you are limited to whatever resolution your png image has; it doesn't scale smoothly the way PostScript does.

>I think I'm going to experiment with the PNG-approach tomorrow.
>I just have to figure out how to best include .png's into my
>LaTeX document and thereby still keep the quality of my document
>good enough so that I can send it to a journal.

I have not had any problems with this, although sometimes the journal print office wants a higher-resolution png image for the final press run.
--
Ethan A Merritt
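The trade-off between PNG resolution and printed size comes down to simple arithmetic, sketched here in Python. The function names are invented for this illustration, and the 300 or 600 dpi targets are the figures mentioned earlier in the thread, not a requirement of any particular journal.

```python
def png_size_for_print(width_in, height_in, dpi):
    """Return (width_px, height_px) needed to print at this size and dpi."""
    if width_in <= 0 or height_in <= 0 or dpi <= 0:
        raise ValueError("dimensions and dpi must be positive")
    return round(width_in * dpi), round(height_in * dpi)


def effective_dpi(width_px, printed_width_in):
    """Resolution actually obtained when an image is scaled to a printed width."""
    if printed_width_in <= 0:
        raise ValueError("printed width must be positive")
    return width_px / printed_width_in
```

For example, a figure printed 5 inches wide at 600 dpi needs a PNG 3000 pixels wide, whereas a 640-pixel-wide image stretched to the same width gives only 128 dpi, which is the kind of case where a print office might ask for a higher-resolution file.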

#9
On 2006-01-05, tv <tommaso.vinci@gmail.com> wrote:
>
> I think that if you can, the pdflatex is the solution: it take in input
> also pdf which is a vector based graphics as eps but it's compressed.
> sou you could just use the pdf terminal that gnuplot has and you'll
> work in a very smoot way....

Hmm... strange... I wanted to experiment with the pdf terminal today, but apparently I can't use it because I get the error:

    unknown or ambiguous terminal type; type just 'set terminal' for a list

and it is indeed not listed if I type

    gnuplot> set terminal

However, I am able to view its help page with

    gnuplot> help set terminal pdf

Am I doomed to not being able to use the pdf terminal with my version of gnuplot? Or is there something I should ask our sysadmins to install so that I can use the pdf terminal? I'm using:

    G N U P L O T
    Version 4.0 patchlevel 0
    last modified Thu Apr 15 14:44:22 CEST 2004
    System: Linux 2.6.8-2-686-smp

on a Debian GNU/Linux stable box.

Best wishes,
Bart
--
"Share what you know. Learn what you don't."

#10
On 2006-01-06, Ethan Merritt <merritt@u.washington.edu> wrote:
>
> \usepackage{graphicx}
> ...
> \includegraphics[clip]{figure.png}
>
>>I did some quick googling which taught me
>>that I might have to use `pdflatex' instead of `latex' then?
>
> Not that I know of. I don't normally use pdflatex.

Hmm... so the above works for you if you simply compile your document with `latex myfile.tex'? Strange... if I do this, then I get:

    LaTeX Error: Cannot determine size of graphic in mydata.png (no BoundingBox)

If I process it with `pdflatex file.tex', then it works...

Regards,
Bart
--
"Share what you know. Learn what you don't."


