#1
http://www.jwdt.com/~paysan/httpd-en.html

Can any of you readily read this code?

If I remember right, a server through inetd will not be very efficient, although I could be wrong.
|
#2
gavino wrote:
> http://www.jwdt.com/~paysan/httpd-en.html
>
> can any of you readily read this code?

Of course. It's pretty straightforward Forth. Don't expect Forth to look like C; it's a completely different and unrelated language. It's like asking a bunch of Chinese if they can read this strange language - yes, of course they can.

> If I remember right, a server through inetd will not be very efficient
> although I could be wrong.

With the newly improved socket.fs, it should be quite simple to implement the server without inetd (I haven't found the time yet). But on efficiency: I used this web server for years on my memory-starved 32MB laptop instead of Apache, because it was so much faster (for local viewing of my web pages). Gforth doesn't take long to start and to compile the application, and it doesn't take much memory; all this offsets the potential performance gains from Apache in this particular environment.

--
Bernd Paysan
"If you want it done right, you have to do it yourself"
http://www.jwdt.com/~paysan/
|
#3
On Aug 16, 3:20 pm, gavino <gavcom...@gmail.com> wrote:
> http://www.jwdt.com/~paysan/httpd-en.html
>
> can any of you readily read this code?

I work with a guy who states he is relatively weaker in C than the other engineers at work. He's not dumb, but he does tend to get distracted by the details. He commented once that he admired that I and others can just glance at a screen full of C and quickly figure out what it does.

I explained that when reading source code in any language I know (and languages I don't), what I'm looking for are big-picture kinds of things. If I want to know what a block of code does, I'm going to look at things like symbol names and comments. I'm going to look for control structures-- what are the predicates in "if" statements; what are the control variables in a loop; what arguments are being passed in; what values are being returned. Once you get this gross sense of what the code does, *then* you can dive into the details and figure out what you need to know.

It's no different with Forth. Take this word from the code you're looking at:

    : rework-% ( add -- ) { url }  base @ >r hex
        0 url $@len 0 ?DO
            url $@ drop I + c@ dup '% = IF
                drop 0. url $@ I 1+ /string
                2 min dup >r >number r> swap - >r 2drop
            ELSE  0 >r  THEN  over url $@ drop + c!  1+
        r> 1+ +LOOP  url $!len
        r> base ! ;

I find this code readable. It's standard Forth, nothing really exotic here. There are some private words ($@len, $@, $!len), but I can guess what they do from the names and context. There is a lot of stack noise, but that's par for the course for lower-level words like this.

But let's say I didn't know Forth well, and I didn't know Forth idioms. When I look at this word, I start by looking at gross structure. I see the number base being changed to hex, but don't see any hex numbers, which suggests that the code either inputs or outputs hex numbers. I don't see any obvious output statements, but do see >number, so I can reasonably guess it's parsing hex numbers. I see a loop. I don't yet know what $@len is, but from its name and from the preceding "url" I'm guessing that the loop iterates over the length of the URL. Inside the loop I see an IF, and so I look for the predicate. Here we're apparently comparing something to a % character. And since this is a URL, and since as you should know characters in a URL are encoded with % followed by two hexadecimal characters, this should give you a very strong clue as to what the code does.

So you start with the big picture. And in this case, if you don't know the underlying HTTP protocol and if you don't understand how a web server works, then none of this is going to make any sense to you. Like with anything, if you don't understand the domain and the problem being solved, then understanding the code probably isn't going to help you much.

Unfortunately for the novice, when you get past the big picture and dive into the details, Forth pretty much forces you to deal with even the most mundane issues. Here we have lots of "stack noise." In other languages, you only see this if you choose to look at the underlying details. In Forth, you see it all. This is both a strength and a weakness of the language.

> If I remember right, a server through inetd will not be very efficient
> although I could be wrong.

You're right, and this is something I (and I believe others) told you quite some time ago when you first gushed over the fact that someone wrote a web server in Forth. And noting the size and complexity of web servers like Apache, your mind seemed to latch onto the notion that by simply writing a web server in Forth, it would be faster, smaller, better, more chocolatey.

But that's okay. It's a repeated theme here in comp.lang.forth. There are *lots* of people here who love comparing apples with oranges, often resulting in nonsense that doesn't survive even a casual analysis.

What Bernd gives you with his code is a stripped-down, bare-essentials web server for Unix or Unix-like systems, suitable for light-duty work or embedded systems. Great stuff for what it is, but modesty isn't a common attribute with Forth programmers. He claims he has "delivered an ``almost'' complete Apache clone." Ummm, no. What he has done, at best, is recreated the functionality of those specific parts of Apache that he cares about. That's a very different thing.

He's only running a single site so he doesn't support virtual hosts or bother parsing the "Host" header. He doesn't care about respecting caching, so he has no code to support that. He doesn't care about mapping URLs, SSL/TLS encryption, session authentication, or content negotiation. He only cares about text/html, so other media types (like images) can be handled with another server. And while no inetd server would survive a site being Slashdotted or Digged, he doesn't care-- this is code that goes into an Internet-enabled refrigerator.

The claim that his code is an "almost" complete clone of Apache bothers me for the same reason I'm bothered when someone in comp.lang.forth states with horror that a "Hello World" program in C resulted in a megabyte (or more) of code. Ignoring the usual comparisons of apples and oranges (or in this case, specific-function bare-metal embedded systems and generalized server/desktop operating systems), we have people who confuse actual code with debugging support symbols. We have people who confuse actual code with headers specifying things like libraries to link, relocation tables, and so on. Oh, and we have people who don't understand that functions like printf are an interpreter for a formatting language, and functions like puts are just a loop over a character string.

But details like that don't interest the Forth-faithful. They want to hear short, punchy stories about Forth. Doesn't matter if they are true, if they can be verified by anything more than nth-person anecdote, or if they even make logical sense.

My point is this: When conversations here in comp.lang.forth make meaningful comparisons and are rationally qualified, I have no problem. You tell me that Bernd's web server is a useful tool for embedded systems and it implements a useful subset of the features of Apache that Bernd cares about, and I'll agree with you. But you tell me that it's an "almost" complete clone of Apache, and I'll point out every real-world feature Apache has that Bernd's server doesn't.

gavino, in the past, you would periodically write messages in comp.lang.forth asking questions of the form "can this be done in Forth" applied to web servers and databases and other applications. In response, I and others told you yes, and that if you wanted to understand this better, you should learn Forth. Well, now it looks like you're actually doing that, and you're hopefully learning both the strengths and weaknesses of Forth. Congratulations!

Now it's time for the next step. It's time to get skeptical. It's time to take people's claims and hold them up to the light. It's time to look at the edges, where unqualified statements usually fall flat. You've started that-- you've correctly identified that a web server that uses inetd to handle socket communications isn't going to be efficient. Now you're starting to find the edges of the claims. Now you're starting to figure out the right questions to ask.
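For readers who do not read Forth, the behaviour that the post infers for rework-% (replace each % followed by two hexadecimal characters with the character it encodes) can be sketched in Python. This is only an illustration of the idea, not a translation of Bernd's word: the name `decode_percent` is hypothetical, the sketch treats each escape as a single character code, and it assumes that a % not followed by two hex digits is simply left alone, which may differ from what the Forth code does with malformed input.

```python
import string

HEX_DIGITS = set(string.hexdigits)


def decode_percent(url: str) -> str:
    """Replace each %XX escape with the character whose code is hex XX."""
    out = []
    i = 0
    while i < len(url):
        ch = url[i]
        pair = url[i + 1:i + 3]  # the (up to) two characters after ch
        if ch == "%" and len(pair) == 2 and all(c in HEX_DIGITS for c in pair):
            out.append(chr(int(pair, 16)))  # two hex digits -> one character
            i += 3                          # skip '%' and both digits
        else:
            out.append(ch)                  # ordinary character: copy as-is
            i += 1
    return "".join(out)
```

Like the Forth word, the sketch walks the string once and advances by either one or three positions per step; unlike the Forth word, it builds a new string instead of rewriting the buffer in place.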
|
#4
John Passaniti <john.passaniti@gmail.com> wrote:
> On Aug 16, 3:20 pm, gavino <gavcom...@gmail.com> wrote:
> > http://www.jwdt.com/~paysan/httpd-en.html
>[...]
> What Bernd gives you with his code is a stripped-down, bare-essentials
> web server for Unix or Unix-like systems, suitable for light-duty work
> or embedded systems. Great stuff for what it is, but modesty isn't a
> common attribute with Forth programmers. He claims he has "delivered
> an ``almost'' complete Apache clone." Ummm, no. What he has done, at
> best, is recreated the functionality of those specific parts of Apache
> that he cares about. That's a very different thing.
>
> He's only running a single site so he doesn't support virtual hosts or
> bother parsing the "Host" header. He doesn't care about respecting
> caching, so he has no code to support that. He doesn't care about
> mapping URLs, SSL/TLS encryption, session authentication, or content
> negotiation. He only cares about text/html, so other media types
> (like images) can be handled with another server.[...]

Fortunately it is not that frugal. It reads a system's mime.types file in order to map file extensions to MIME types.

Marc
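As a rough illustration of what mapping file extensions to MIME types from a mime.types file involves, here is a minimal Python sketch. It is not how httpd.fs does it; it only assumes the usual mime.types layout (one type per line followed by its extensions, with # starting a comment). The names `parse_mime_types` and `content_type` and the fallback type are hypothetical choices for this example.

```python
def parse_mime_types(lines):
    """Build an extension -> MIME type table from mime.types-style lines."""
    table = {}
    for line in lines:
        line = line.split("#", 1)[0].strip()   # drop comments and whitespace
        if not line:
            continue
        mime_type, *extensions = line.split()  # a type may list no extensions
        for ext in extensions:
            table[ext.lower()] = mime_type
    return table


def content_type(path, table, default="application/octet-stream"):
    """Look up the MIME type for a request path by its file extension."""
    name = path.rsplit("/", 1)[-1]             # last path component
    if "." not in name:
        return default
    ext = name.rsplit(".", 1)[1].lower()
    return table.get(ext, default)
```

With a table built from lines such as `text/html html htm` and `image/png png`, a request for `/pics/logo.png` would be answered with `image/png`, which is all a small server needs in order to deliver images as well as HTML.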
|
#5
John Passaniti wrote:
> What Bernd gives you with his code is a stripped-down, bare-essentials
> web server for Unix or Unix-like systems, suitable for light-duty work
> or embedded systems. Great stuff for what it is, but modesty isn't a
> common attribute with Forth programmers. He claims he has "delivered
> an ``almost'' complete Apache clone." Ummm, no. What he has done, at
> best, is recreated the functionality of those specific parts of Apache
> that he cares about. That's a very different thing.

What I want to say with this exaggeration (the "almost" is even in quotes) is that I've already done too much for the specific thing I want to achieve. The thing I want to achieve is in the introduction. I very clearly specify what my goal is. If some people misunderstand that, try rereading the introduction, and try understanding what my goal is in the "Outlook" section (big hint: it mentions getting an HTTP server into one screen).

And don't forget: The 90/10 rule actually should be used as a guideline for what to implement: 90% of the users of bloatware use 10% of the functions (in typical bloatware, that's even a common 10%). Just implement 10% of the functions in lightweight software, and these 90% are happy.

> He's only running a single site so he doesn't support virtual hosts or
> bother parsing the "Host" header. He doesn't care about respecting
> caching, so he has no code to support that. He doesn't care about
> mapping URLs, SSL/TLS encryption, session authentication, or content
> negotiation. He only cares about text/html, so other media types
> (like images) can be handled with another server.

Actually, I do care about other media types, and images work fine with my web server (that's what the MIME-READ word is for). What I propose is that you often don't have to do this - it's part of the "Apache cloning" that's not really necessary for stripped-down tasks.

The version of httpd.fs that comes with Gforth has an example extension to interpret URLs. There's at least one way to use inetd plus an SSL wrapper to wrap such simple applications into an SSL connection, too. The header parsing function actually parses the Host header, but it does not interpret it. Like Apache, a more complete httpd.fs would have a module for virtual hosts, which then would use the Host: header to read the host-specific configuration or however this is implemented.

> And while no inetd
> server would survive a site being Slashdotted or Digged, he doesn't
> care-- this is code that goes into an Internet-enabled refrigerator.

Actually, I'm not that sure. If a server gets slashdotted, one of the basic problems is how many pages you can deliver concurrently - the other side is potentially slow (or at least used to be), so fast delivery is not the requirement (also, your uplink is jammed anyway). You can load way more Gforths+httpd.fs into memory than you can fork() Apaches. Most browsers can keep connections alive, so once they have a connection, the inetd overhead is no longer needed. A lightweight web server helps when you get slashdotted, even if under normal conditions it might be slow (inetd potentially increases latency, but Gforth loads faster than the round trip delay in the network).

--
Bernd Paysan
"If you want it done right, you have to do it yourself"
http://www.jwdt.com/~paysan/
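The reason such a server can be wrapped by inetd (or by an SSL wrapper in front of it) is that the program never touches a socket itself: inetd accepts the connection and hands it to the program as standard input and output. A minimal Python sketch of that shape follows. It is not a port of httpd.fs; the function name `serve_one`, the plain-text reply and the decision to ignore all headers are assumptions made to keep the example short.

```python
def serve_one(inp, out):
    """Handle one HTTP request read from `inp`, writing the reply to `out`.

    Both arguments are binary file-like objects. Under inetd they would be
    sys.stdin.buffer and sys.stdout.buffer, since inetd attaches the accepted
    connection to the program's standard input and output.
    """
    request_line = inp.readline().decode("latin-1").strip()
    # Read and discard the header lines; a blank line (or EOF) ends them.
    while inp.readline() not in (b"\r\n", b"\n", b""):
        pass
    parts = request_line.split()
    if len(parts) == 3 and parts[0] == "GET":
        status = "200 OK"
        body = f"You asked for {parts[1]}\n".encode("latin-1")
    else:
        status = "400 Bad Request"
        body = b"Bad request\n"
    head = (
        f"HTTP/1.0 {status}\r\n"
        "Content-Type: text/plain\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    )
    out.write(head.encode("latin-1") + body)
    out.flush()
```

Because the handler only sees two streams, anything that can sit between the network and those streams, such as inetd or an SSL wrapper, can be added without changing the handler. The cost is one process start per connection, which is the overhead the thread is arguing about.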
|
#6
On Sat, 16 Aug 2008 12:20:11 -0700, gavino wrote:
> http://www.jwdt.com/~paysan/httpd-en.html
>
> can any of you readily read this code?

Welcome to Forth.

> If I remember right, a server through inetd will not be very efficient
> although I could be wrong.

Totally depends on the process load time. Hang a Python web server framework off inetd and you'd have an obese dog with fleas. Forth can be much more lightweight.

FWIW, if I were doing httpd in gforth I'd write it with much smaller words. I'm one of those 'old fart' programmers who indulge in this lame practice of putting general and stack comments in source code, to make words readable to those who don't necessarily eat, sleep, breathe, pee, dream and make love in Forth.
|
#7
On Aug 17, 4:21 pm, Bernd Paysan <bernd.pay...@gmx.de> wrote:
> What I want to say with this exaggeration (the "almost" is even in quotes)
> is that I've already done too much for the specific thing I want to
> achieve. The thing I want to achieve is in the introduction. I very clearly
> specify what my goal is. If some people misunderstand that, try rereading
> the introduction, and try understanding what my goal is in the "Outlook"
> section (big hint: it mentions getting a http server into one screen).

As evidenced by my criticism, I didn't have any problem understanding your goals. I wrote that you implemented exactly what you needed and nothing more. That's not a slight. That's a recognition of the "simplest thing that could possibly work" mindset that is dominant not only in Forth, but in modern development practice. It's valid, respectable, and useful.

What isn't valid, respectable, or useful is engaging in the same kind of unbounded and unqualified hyperbole that often passes uncontested here in comp.lang.forth. You can try to claim it's obvious exaggeration, but even as exaggeration, it doesn't make sense. Your article is about creating an HTTP server, not Apache. You don't have the same design goals as Apache and you don't have the same functionality and features as Apache.

It would have made much more sense to compare your web server to other small "lightweight" web servers that are used in embedded systems or embedded in applications. For example, you could have compared your code to any of these web servers:

http://en.wikipedia.org/wiki/Compari...ht_web_servers

> And don't forget: The 90/10 rule actually should be used as guideline what
> to implement: 90% of the users of bloatware use 10% of the functions (in
> typical bloatware, that's even a common 10%). Just implement 10% of the
> functions in a lightweight software, and these 90% are happy.

What does this have to do with comparing your code to Apache? For the kinds of embedded systems and applications one would inject an HTTP server into (ummm, the target of your goal), few if any would use Apache. You're starting with a faulty premise, and using that faulty premise to make an untrue claim.

Or put another way, there are much more effective ways to evangelize design goals based on coding only what is necessary than comparing apples to oranges. What's next-- you wrote a routine that formats text, so you have an "almost" complete clone of Word? Hey, how about code that adds a column of numbers-- why, that's functionally equivalent to what most people use Excel for. How long until you hook up a battery to a light bulb behind frosted glass and claim you've replaced an entire computer? After all, they are functionally equivalent because they both light up a screen.
|
#8
On Aug 16, 1:18 pm, Bernd Paysan <bernd.pay...@gmx.de> wrote:
> gavino wrote:
> >http://www.jwdt.com/~paysan/httpd-en.html
> >
> > can any of you readily read this code?
>
> Of course. It's pretty straight-forward Forth. Don't expect Forth to look
> like C, it's a completely different and not related language. It's like
> asking a bunch of Chinese if they can read this strange language - yes, of
> course they can.
>
> > If I remember right, a server through inetd will not be very efficient
> > although I could be wrong.
>
> With the newly improved socket.fs, it should be quite simple to implement
> the server without inetd (haven't found the time yet). But on efficiency: I
> used this web server for years on my memory starved 32MB laptop instead of
> Apache, because it was so much faster (for local viewing of my web pages).
> Gforth doesn't take long to start and to compile the application, it
> doesn't take much memory, and all this offsets the potential performance
> gains from Apache in this particular environment.
>
> --
> Bernd Paysan
> "If you want it done right, you have to do it yourself"
> http://www.jwdt.com/~paysan/

Sweet. Did you use it to provide dynamic pages, or just static so far?
|
#9
On Aug 17, 1:21 pm, Bernd Paysan <bernd.pay...@gmx.de> wrote:
> John Passaniti wrote:
> > What Bernd gives you with his code is a stripped-down, bare-essentials
> > web server for Unix or Unix-like systems, suitable for light-duty work
> > or embedded systems. Great stuff for what it is, but modesty isn't a
> > common attribute with Forth programmers. He claims he has "delivered
> > an ``almost'' complete Apache clone." Ummm, no. What he has done, at
> > best, is recreated the functionality of those specific parts of Apache
> > that he cares about. That's a very different thing.
>
> What I want to say with this exaggeration (the "almost" is even in quotes)
> is that I've already done too much for the specific thing I want to
> achieve. The thing I want to achieve is in the introduction. I very clearly
> specify what my goal is. If some people misunderstand that, try rereading
> the introduction, and try understanding what my goal is in the "Outlook"
> section (big hint: it mentions getting a http server into one screen).
>
> And don't forget: The 90/10 rule actually should be used as guideline what
> to implement: 90% of the users of bloatware use 10% of the functions (in
> typical bloatware, that's even a common 10%). Just implement 10% of the
> functions in a lightweight software, and these 90% are happy.
>
> > He's only running a single site so he doesn't support virtual hosts or
> > bother parsing the "Host" header. He doesn't care about respecting
> > caching, so he has no code to support that. He doesn't care about
> > mapping URLs, SSL/TLS encryption, session authentication, or content
> > negotiation. He only cares about text/html, so other media types
> > (like images) can be handled with another server.
>
> Actually, I do care about other media types, and images work fine with my
> web server (that's what the MIME-READ word is for). What I propose is that
> you often don't have to do this - it's part of the "Apache cloning" that's
> not really necessary for stripped down tasks.
>
> The version of httpd.fs that comes with Gforth has an example extension to
> interpret URLs. There's at least one way to use inetd plus an SSL wrapper
> to wrap such simple applications into an SSL connection, too. The header
> parsing function actually parses the Host header, but it does not interpret
> it. Like Apache, a more complete httpd.fs would have a module for virtual
> hosts, which then would use the Host: header to read the host-specific
> configuration or however this is implemented.
>
> > And while no inetd
> > server would survive a site being Slashdotted or Digged, he doesn't
> > care-- this is code that goes into an Internet-enabled refrigerator.
>
> Actually, I'm not that sure. If a server gets slashdotted, one of the basic
> problems is how many pages you can deliver concurrently - the other side is
> potentially slow (or at least used to be), so fast delivery is not the
> requirement (also, your uplink is jammed anyway). You can load way more
> Gforths+httpd.fs into memory than you can fork() Apaches. Most browsers can
> keep connections alive, so once they got a connection, the inetd overhead
> is no longer needed. A lightweight web server helps when you get
> slashdotted, even if under normal conditions it might be slow (inetd
> potentially increases latency, but Gforth loads faster than the round trip
> delay in the network).
>
> --
> Bernd Paysan
> "If you want it done right, you have to do it yourself"
> http://www.jwdt.com/~paysan/

How many k does a single connection take up, roughly? I admin Apache prefork boxes that seem to take 20M for each child when running PHP apps.
|
#10
gavino wrote:
> sweet, did you use it to provide dynamic pages or just static sofar?

Static pages. There's this simple 4-line "dynamic content" package, but to be realistic, really useful dynamic content should at least have code to handle user logins and cookies, and that includes the refrigerator application (a database in files and directories is completely sufficient for many simpler tasks, but login is necessary if you want to do more than just monitoring, e.g. adjust the temperature or whatever).

My philosophy is that you should pre-create as much "dynamic" content as possible - the number of updates is far less than the number of read accesses. The only excuse is when you can't predict what the dynamic content will look like, e.g. when you randomly insert ads, or when you give access to tons of data like web-mail interfaces or web-version control or this sort of application, and duplicating all that data is not feasible (especially in the web-VCS interface, where you can e.g. compare two different versions of the same file, which is stored as a compact representation in the VCS, but has a combinatorial explosion in the web interface).

What I should do to show how it works on dynamic content is to combine httpd.fs with wf.fs (the wiki format translator), and add the stuff for an actual simple wiki, like users, logins, and cookies. Certainly, if you are not logged in and just looking at the most recent versions of pages, you'll see static content in this example, too.

> how many k does a single connection take up roughly?

Depends on which Gforth engine you use. Execution speed is not really an issue here, so let's use gforth-itc. top tells me 1356k res, of which 620k are shared, but I don't really trust it. Examining /proc/<pid>/maps tells me that 200k are dynamically allocated by Gforth when httpd.fs is compiled.

> I admin apache prefork boxes that seem to take 20M eachchild when
> running php apps.

Yes, I can confirm that; that's my experience as well. On a 2GB machine, you can sustain 100 concurrent connections. With httpd.fs, you can probably sustain 10k concurrent connections on the same machine. That's enough for a hard slashdot DDoS attack (remember, bandwidth is no problem in this case, so you don't need to care about startup time and such; this is all in red alert mode then, anyway).

--
Bernd Paysan
"If you want it done right, you have to do it yourself"
http://www.jwdt.com/~paysan/
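The two figures in the post above follow from dividing total memory by the per-process footprint. The short Python sketch below redoes that division with the numbers quoted in the thread (2GB of RAM, 20M per Apache child, 200k per Gforth process). It is a back-of-the-envelope estimate only: it assumes one process per connection and ignores shared pages, the kernel and everything else running on the machine, and the helper name `max_processes` is hypothetical.

```python
KB = 1024
MB = 1024 * KB
GB = 1024 * MB


def max_processes(total_bytes, per_process_bytes):
    """Upper bound on how many processes of a given size fit in memory."""
    return total_bytes // per_process_bytes


apache_children = max_processes(2 * GB, 20 * MB)    # 20M per prefork child
gforth_servers = max_processes(2 * GB, 200 * KB)    # 200k per Gforth + httpd.fs

print(apache_children)  # 102, i.e. roughly 100 concurrent connections
print(gforth_servers)   # 10485, i.e. roughly 10k concurrent connections
```

The ratio between the two results is just the ratio of the footprints (about 100 to 1), so the estimate is only as good as the two memory measurements behind it.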