|
#1
Hi friends of CORBA,

I have another beginner question.

I have implemented a distributed application using CORBA (for those who know, that's not the one with MFC). Everything works reasonably well, but in this system a servant that has bound itself at the name service can of course crash for some reason.

In that case the reference to the "dead" servant still exists in the name service, and other clients may attempt to invoke a request on it.

At the moment, when a client makes a request on a dead servant, it takes a certain amount of time (let's say 40 s) before it gets a CORBA exception. This behaviour is not acceptable, so:

1.) Is there a possibility that a servant will be unbound automatically when it crashes (no exception, but e.g. no electricity)?

2.) If not (which I expect), how can I decrease the time before the CORBA exception is thrown? For instance, it would be OK if this exception came within 100 ms or so.

3.) I think the best solution would be something like this:

    If (servant exists) {
        make the request
    } else {
        unbind the corpse ;-)
    }

I know there is a CORBA function like non_existent(), but unfortunately this function is not part of the CORBA Minimal Standard. Does anybody have an idea how I can check whether the reference is still valid before making the actual request?

I hope somebody out there can help me or at least give me a hint. Thanks in advance and merry Christmas!
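The check-then-call idea in point 3 could be sketched roughly as follows. This is only an illustration in plain Python, not ORB code: `StubRef`, its `non_existent()` probe, and the dictionary standing in for the name service are all hypothetical. Note that a probe followed by a request is inherently racy, since the server can die between the two calls, so the sketch also treats a failed request as a reason to unbind.

```python
from typing import Dict, Optional


class StubRef:
    """Hypothetical stand-in for an object reference; not a real ORB proxy."""

    def __init__(self, alive: bool) -> None:
        self.alive = alive

    def non_existent(self) -> bool:
        # A real probe would be a remote call and could itself block or raise.
        return not self.alive

    def request(self) -> str:
        if not self.alive:
            raise ConnectionError("server unreachable")
        return "ok"


def call_or_unbind(names: Dict[str, StubRef], name: str) -> Optional[str]:
    """Probe the reference first; remove the binding if it looks dead."""
    ref = names.get(name)
    if ref is None:
        return None
    try:
        if not ref.non_existent():
            return ref.request()
    except ConnectionError:
        pass  # the server went away between the probe and the request
    del names[name]  # "unbind the corpse"
    return None
```

The probe does not remove the latency problem: against a powered-off host, the probe itself would wait just as long as the real request.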
|
#2
Hi Max,

If you have timeouts available, then on the client side you could simply reduce the timeout to, say, 100 ms. If no response is received within that interval, the client can catch the exception and take the appropriate action. If timeouts are not an option for you, the approaches I describe below might well work.

Others will have different views on this, but your problem is not new. Essentially, you need to ensure the integrity of the objects bound in the Name Service, i.e. you need some means of checking the status of each object and unbinding it if need be.

Ideally, you'd like to unregister your objects upon deletion, i.e. put code in the destructor of your servants to unregister them. That deals with normal operation. However, you're more concerned with an ungraceful servant crash. In that case no destructor code gets called, and stale references are left knocking around the NS.

You could use callbacks to achieve the desired effect: your servants could have a method that "calls back" on an object every N seconds. This is effectively pinging, with all the baggage that comes with it (network traffic etc.). It's an option all the same.

Another approach would be some sort of monitor process. This process iterates over the entries in the name service every N seconds and invokes some sort of "ping" method on each object. When the ping gets no response or throws an exception, the monitor (acting as a client) can deduce that the object is no longer valid and needs to be unbound from the NS.

This will reduce, but not eradicate, the possibility of your clients getting bogus references from the Name Service.

Both of these involve code. There's no CORBA silver bullet here that addresses your problem. One of these solutions might be feasible for you.

Alternatively you could take the advice of Ciaran McHale, who will probably blow holes in my posting, ha ha.

Aside: I wonder if the latest version of the CORBA spec addresses this in any shape or form. It's an old problem.

Cheers
Graham
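A minimal sketch of the monitor-process idea, assuming the name service can be modelled as a dictionary from names to zero-argument "ping" callables. The names `sweep` and `monitor` are made up for illustration, and a real monitor would list and unbind entries through the naming service interface rather than a local dictionary.

```python
import time
from typing import Callable, Dict, List


def sweep(bindings: Dict[str, Callable[[], None]]) -> List[str]:
    """Ping every binding once and unbind those whose ping raises."""
    removed = []
    # Iterate over a copy because entries are deleted during the sweep.
    for name, ping in list(bindings.items()):
        try:
            ping()
        except Exception:
            del bindings[name]
            removed.append(name)
    return removed


def monitor(bindings: Dict[str, Callable[[], None]],
            interval_s: float, rounds: int) -> None:
    """Run `rounds` sweeps, sleeping `interval_s` seconds after each one."""
    for _ in range(rounds):
        sweep(bindings)
        time.sleep(interval_s)
```

Between two sweeps a client can still pick up a stale entry, which is why this only reduces the problem. Each ping against a dead host is also subject to the same connection timeout as a normal request.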
|
#3
Hi, see inline,

Ke

maxpower24@gmx.net wrote:
> Hi friends of CORBA,
>
> I have another beginner question.
>
> I have implemented a distributed application using CORBA (for those
> who know, that's not the one with MFC). Everything works reasonably
> well, but in this system a servant that has bound itself at the
> name service can of course crash for some reason.
>
> In that case the reference to the "dead" servant still exists in the
> name service, and other clients may attempt to invoke a request on it.
>
> At the moment, when a client makes a request on a dead servant, it
> takes a certain amount of time (let's say 40 s) before it gets a CORBA
> exception. This behaviour is not acceptable, so:
>

If it took 40 seconds to report this problem, then it was not the "servant" that was unavailable but the link-level connection (e.g. you unplugged the cable or the server host was powered off).

> 1.) Is there a possibility that a servant will be unbound
> automatically when it crashes (no exception, but e.g. no electricity)?
>

A servant can't crash. I believe you mean that a "server process" or the "server host" crashes (or powers down), rather than a "servant".

If the server process crashes, you should get a CORBA OBJECT_NOT_EXIST exception immediately, because the transport layer can report the failure when it is unable to initiate the connection to the target server.

If the server host crashes (or is powered off), it is the same as a link-layer failure (the same as unplugging the cable). The delay then depends on the TCP connection timeout of your client-side TCP stack. On Unix, it takes a few tens of seconds (40, as you said above) to time out. On Windows, it takes a few seconds (I remember it being only 2 seconds) before you see the exception.

Also, some ORBs have other rebind mechanisms (for example, VisiBroker will try to query the osagent for an alternative server), which may increase the timeout in the link-layer failure case.

> 2.) If not (which I expect), how can I decrease the time before the
> CORBA exception is thrown? For instance, it would be OK if this
> exception came within 100 ms or so.
>

a. Reconfigure your link layer (reconfigure the Unix kernel or the Windows registry), or

b. set a request timeout. The side effect, however, is that you may get a TIMEOUT exception even when the connection, the server, and the object implementation are perfectly OK (but take longer to process the request).

Regards,
Ke

> 3.) I think the best solution would be something like this:
>
>     If (servant exists) {
>         make the request
>     } else {
>         unbind the corpse ;-)
>     }
>
> I know there is a CORBA function like non_existent(), but
> unfortunately this function is not part of the CORBA Minimal
> Standard. Does anybody have an idea how I can check whether the
> reference is still valid before making the actual request?
>
> I hope somebody out there can help me or at least give me a hint.
> Thanks in advance and merry Christmas!
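The difference between a crashed process on a live host and an unreachable host can be seen at the plain TCP level, outside any ORB. The sketch below is an assumption-laden illustration using Python's standard `socket` module; `probe` is a made-up helper name, and an ORB's own request timeout would be configured through ORB-specific means not shown here.

```python
import socket


def probe(host: str, port: int, timeout_s: float) -> str:
    """Classify one TCP connect attempt as 'up', 'refused', 'timeout' or 'error'."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return "up"
    except ConnectionRefusedError:
        # Host is alive but nothing listens on the port: reported at once.
        return "refused"
    except socket.timeout:
        # No answer at all (host down, cable unplugged): we waited timeout_s.
        return "timeout"
    except OSError:
        # Other network errors, e.g. no route to host.
        return "error"
```

With a short `timeout_s` the "timeout" case returns quickly, but as noted above a slow yet healthy server can then also be misreported as unavailable.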
|
#4
Ke Jin wrote:
> If the server process crashes, you should get a CORBA
> OBJECT_NOT_EXIST exception immediately, because the transport layer
> can report the failure when it is unable to initiate the connection
> to the target server.

That's not right. OBJECT_NOT_EXIST is authoritative and can be returned only after consultation with the server. If the server cannot be reached, you should get COMM_FAILURE or some such.

Cheers,

Michi.
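One way a client or monitor could act on this distinction is sketched below. The exception classes are local stand-ins defined only for the example (they are not imported from any ORB), `TRANSIENT` is included as one plausible reading of "or some such", and the action names are hypothetical.

```python
class SystemException(Exception):
    """Local stand-in for a CORBA system exception; not an ORB class."""


class OBJECT_NOT_EXIST(SystemException):
    """The server was consulted and says the object is gone for good."""


class COMM_FAILURE(SystemException):
    """The server could not be reached, so nothing is known about the object."""


class TRANSIENT(SystemException):
    """The request could not be delivered right now and may succeed later."""


def action_for(exc: Exception) -> str:
    """Map a failure to a follow-up action for a naming-service entry."""
    if isinstance(exc, OBJECT_NOT_EXIST):
        return "unbind"       # authoritative: safe to remove the binding
    if isinstance(exc, (COMM_FAILURE, TRANSIENT)):
        return "retry-later"  # not authoritative: the server may come back
    return "report"           # anything else is surfaced to the caller
```

Under this policy a powered-off host never causes an unbind by itself, which trades stale entries for protection against removing objects whose server is only temporarily unreachable.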
|
#5
Michi Henning wrote:
> Ke Jin wrote:
>
> > If the server process crashes, you should get a CORBA
> > OBJECT_NOT_EXIST exception immediately, because the transport layer
> > can report the failure when it is unable to initiate the connection
> > to the target server.
>
> That's not right. OBJECT_NOT_EXIST is authoritative and can be returned
> only after consultation with the server. If the server cannot be reached,
> you should get COMM_FAILURE or some such.
>

Whatever; the point is not the type of the exception but its latency. As I said, by default, if the server crashed before the request was sent, the client should get an exception *immediately*, not after 40 seconds.

Regards,
Ke

> Cheers,
>
> Michi.
|
#6
> Both of these involve code. There's no CORBA silver bullet here that
> addresses your problem. One of these solutions might be feasible for you.

What kind of bullets are the exceptions "COMM_FAILURE" and "OBJECT_NOT_EXIST"?

> Aside: I wonder if the latest version of the CORBA spec addresses this
> in any shape or form. It's an old problem.

Are there any well-known solutions from fault tolerance technology available?
- http://www.ociweb.com/cnb/CORBANewsBrief-200301.html
- http://citeseer.ist.psu.edu/368321.html
- http://www.cs.wustl.edu/~schmidt/cor...-reliable.html
- http://en.wikipedia.org/wiki/Crash-only_software

Regards,
Markus
|
#7
A questionable assumption behind some fault tolerance technologies in addressing this issue (namely the large timeout under a link-layer disconnect) is that a broken link-layer connection is not only detectable in real time but also a fault to be handled by an upper-layer "fault tolerance technology".

This is not necessarily true for a packet-switched network (such as Ethernet). A transient link-layer disconnection in a packet-switched network is neither necessarily detectable in real time nor necessarily a fault for the transport or application layer on top of it. For instance, if the unplugged cable is plugged back in (or a temporarily offline router is brought back online) before the timeout (the 40 seconds observed by the original poster), the transport and application layers should continue to function without noticing the transient link-layer problem.

If an application really needs link-layer disconnection reporting and handling on a real-time scale, it should consider using a circuit-switched link instead of setting a real-time-scale timeout (e.g. a few hundred milliseconds) on a packet-switched link. Setting such a timeout in a packet-switched network would likely introduce more problems than it solves (such as premature failures and a very vulnerable transport connection).

Regards,
Ke

Markus Elfring wrote:
> > Both of these involve code. There's no CORBA silver bullet here that
> > addresses your problem. One of these solutions might be feasible for you.
>
> What kind of bullets are the exceptions "COMM_FAILURE" and "OBJECT_NOT_EXIST"?
>
> > Aside: I wonder if the latest version of the CORBA spec addresses this
> > in any shape or form. It's an old problem.
>
> Are there any well-known solutions from fault tolerance technology available?
> - http://www.ociweb.com/cnb/CORBANewsBrief-200301.html
> - http://citeseer.ist.psu.edu/368321.html
> - http://www.cs.wustl.edu/~schmidt/cor...-reliable.html
> - http://en.wikipedia.org/wiki/Crash-only_software
>
> Regards,
> Markus
|
#8
Ke Jin wrote:
>
> If an application really needs link-layer disconnection reporting and
> handling on a real-time scale, it should consider using a
> circuit-switched link instead of setting a real-time-scale timeout
> (e.g. a few hundred milliseconds) on a packet-switched link. Setting
> such a timeout in a packet-switched network would likely introduce
> more problems than it solves (such as premature failures and a very
> vulnerable transport connection).

In part, the problem isn't just caused by the difficulty of detecting network failure in a timely manner, but also by use of the naming service in the first place. What we have here is a stateful server that maintains a bunch of objects, and a stateful naming service that maintains a bunch of IORs to these objects. So, the server and the naming service maintain redundant state, namely the notion of which objects exist at any given time.

Of course, the server and the naming service can fail independently, which leaves us with the problem that their respective state can go out of sync, and how to recover if it does.

So, this is a design problem, as much as anything else. Any CORBA system that dynamically updates the naming service in this fashion is vulnerable to the problem and should probably be redesigned.

Instead of putting every IOR there is into the naming service, the naming service should contain only a few key IORs that are needed to get off the ground (and that denote essentially singleton objects). Then, instead of the naming service, add a lookup interface to the actual server. Problem solved: no state can go out of sync, and nothing ever needs cleaning up.

IMO, overall, the naming service is a pretty bad idea. Apart from the quite horribly botched IDL design, the service is pragmatically not very useful. At most, I'd use it to locate a handful of key IORs that clients need to get off the ground. For everything else, it's better to build the functionality into the server itself, especially when the set of IORs that clients need to look up is not stable and changes all the time.

Cheers,

Michi.
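The design suggested here, a single bootstrap entry in the naming service plus a lookup interface on the server, might look roughly like this. It is a plain-Python sketch with invented names (`Server`, `create`, `destroy`, `lookup`); in a real system the lookup would be an IDL operation on a singleton object.

```python
from typing import Dict, List


class Server:
    """Owns its objects and answers lookups itself, so there is no second copy."""

    def __init__(self) -> None:
        self._objects: Dict[str, object] = {}

    def create(self, key: str, obj: object) -> None:
        self._objects[key] = obj

    def destroy(self, key: str) -> None:
        # Removing the object also removes its name; nothing else to clean up.
        self._objects.pop(key, None)

    def lookup(self, key: str) -> object:
        try:
            return self._objects[key]
        except KeyError:
            raise LookupError(f"no such object: {key}") from None

    def keys(self) -> List[str]:
        return sorted(self._objects)


# The naming service (modelled as a dict) holds only the bootstrap reference.
bootstrap = {"Server": Server()}
```

If the server process dies, its lookup interface dies with it, so clients cannot obtain references to objects that no longer exist. The bootstrap entry itself can still be stale, but it is a single, rarely changing binding.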
|
#9
Michi Henning wrote:
> Ke Jin wrote:
> >
> > If an application really needs link-layer disconnection reporting and
> > handling on a real-time scale, it should consider using a
> > circuit-switched link instead of setting a real-time-scale timeout
> > (e.g. a few hundred milliseconds) on a packet-switched link. Setting
> > such a timeout in a packet-switched network would likely introduce
> > more problems than it solves (such as premature failures and a very
> > vulnerable transport connection).
>
> In part, the problem isn't just caused by the difficulty of detecting
> network failure in a timely manner, but also by use of the naming
> service in the first place.

The long latency observed by the original poster has nothing to do with the use of the naming service. It is purely in the nature of a packet-switched network. You would see exactly the same error-reporting latency if you tried to telnet to a host that is powered off (or disconnected from the network).

> What we have here is a stateful server
> that maintains a bunch of objects, and a stateful naming service
> that maintains a bunch of IORs to these objects. So, the server
> and the naming service maintain redundant state, namely the notion
> of which objects exist at any given time.
>
> Of course, the server and the naming service can fail independently,
> which leaves us with the problem that their respective state can go
> out of sync, and how to recover if it does.
>

I am confused. What state is to be synchronized between the naming service and a stateful application implementation object? And why wouldn't you need this kind of synchronization if the application object is stateless?

Do you see other, similar directory services needing such state synchronization? For instance, does a DNS server sync its state (namely the domain-to-IP mapping) with a stateful host OS (such as the number of processes and their state)?

> So, this is a design problem, as much as anything else. Any CORBA
> system that dynamically updates the naming service in this fashion
> is vulnerable to the problem and should probably be redesigned.

I don't see how using the naming service could cause the large error-reporting latency (40 seconds, as questioned in the initial post), nor how not using the naming service would avoid it. This latency is irrelevant to whether and how the naming service is used, and irrelevant to whether the object is stateful or stateless; it is purely in the nature of a packet-switched network and of the link-layer error detection algorithm and settings of the transport layer on top of it.

> Instead of putting every IOR there is into the naming service, the
> naming service should contain only a few key IORs that are needed to
> get off the ground (and that denote essentially singleton objects).
> Then, instead of the naming service, add a lookup interface to the
> actual server. Problem solved: no state can go out of sync, and nothing
> ever needs cleaning up.
>
> IMO, overall, the naming service is a pretty bad idea.

I won't comment on whether the naming service is a good or bad idea, but I would like to point out that the naming service is just an OMG abstraction of various long-existing directory services, such as the DCE naming service, the ONC naming service (Yellow Pages), and even IETF DNS. They share the same nature under the same circumstances. For example, you would get the same error-reporting latency when you try to telnet to a host that is temporarily offline.

Also, I still don't see the relevance of the naming service to the problem under discussion, namely that OBJECT_NOT_EXIST or COMM_FAILURE is reported only after 40 seconds when a request is sent to an object on an offline (or powered-off) host.

Regards,
Ke

> Apart from the
> quite horribly botched IDL design, the service is pragmatically not
> very useful. At most, I'd use it to locate a handful of key IORs that
> clients need to get off the ground. For everything else, it's better
> to build the functionality into the server itself, especially when
> the set of IORs that clients need to look up is not stable and changes
> all the time.
>
> Cheers,
>
> Michi.
|
#10
Michi Henning wrote:
> IMO, overall, the naming service is a pretty bad idea. Apart from the
> quite horribly botched IDL design, the service is pragmatically not
> very useful. At most, I'd use it to locate a handful of key IORs that
> clients need to get off the ground. For everything else, it's better
> to build the functionality into the server itself, especially when
> the set of IORs that clients need to look up is not stable and changes
> all the time.

One approach that has the advantage of interoperating with clients expecting to use the naming service is to embed an implementation of the naming service in your server. Then you can guarantee that the object references exported by the naming service don't get out of sync with the actual object lifetimes. Federate your custom naming service implementation with the generic one and voila!

--
Jonathan Biggar
jon@floorboard.com
jon@biggar.org
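A rough model of the federation idea, assuming naming contexts can be represented as small Python objects with a `resolve` method that takes a list of name components. `NamingContext` and `EmbeddedContext` are invented for this sketch and only loosely mirror the real naming service interfaces.

```python
from typing import Any, Dict, List


class EmbeddedContext:
    """Context served from the server's live object table, not from a copy."""

    def __init__(self, live_objects: Dict[str, Any]) -> None:
        self._live = live_objects

    def resolve(self, path: List[str]) -> Any:
        if len(path) != 1:
            raise KeyError(tuple(path))
        return self._live[path[0]]  # raises KeyError once the object is gone


class NamingContext:
    """Generic context that stores bindings and delegates compound names."""

    def __init__(self) -> None:
        self._bindings: Dict[str, Any] = {}

    def bind(self, name: str, target: Any) -> None:
        self._bindings[name] = target

    def resolve(self, path: List[str]) -> Any:
        head, *rest = path
        target = self._bindings[head]
        # A compound name is handed on to the context bound under `head`.
        return target.resolve(rest) if rest else target
```

Binding an `EmbeddedContext` under one name in the generic context means clients keep resolving names the usual way, while every answer below that name comes from the server's own table at the moment of the call.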