#1
Hi all.

Has anyone seen an exception similar to the following, coming from a
call to getBlob()?

> java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
>     at java.util.ArrayList.RangeCheck(ArrayList.java:547)
>     at java.util.ArrayList.get(ArrayList.java:322)
>     at org.apache.derby.client.net.NetCursor.findExtdtaData(Unknown Source)
>     at org.apache.derby.client.net.NetCursor.getBlobColumn_(Unknown Source)
>     at org.apache.derby.client.am.Cursor.getBlob(Unknown Source)
>     at org.apache.derby.client.am.ResultSet.getBlob(Unknown Source)
>     at org.apache.derby.client.am.ResultSet.getBlob(Unknown Source)

This is happening on 10.4.1.3 -- the issue is transient, so even if we
update to 10.4.2.0 we have no way to know if it has fixed the issue.
But I have looked through the list of fixes for 10.4.2.0 and nothing
similar appears to be in it.

The code triggering the error is relatively simple (after removing
resource closing...)

    UUID uuid;
    // ...

    PreparedStatement ps = connection.prepareStatement(
        "SELECT data FROM binaries WHERE guidhigh = ? AND guid = ?");
    ps.setLong(1, uuid.getMostSignificantBits());
    ps.setLong(2, uuid.getLeastSignificantBits());
    rs = ps.executeQuery();
    if (rs.next()) {
        Blob blob = rs.getBlob("data");

        // Take slice of blob
    }

In the mail archives there is a post from someone getting a similar
issue but for CLOBs. A response asked them to raise a bug for it but
I can't find a bug along these lines.

Daniel

--
Daniel Noll                            Forensic and eDiscovery Software
Senior Developer                              The world's most advanced
Nuix                                                email data analysis
http://nuix.com/                                and eDiscovery software

#2
Daniel Noll wrote:
> Hi all.
>
> Has anyone seen an exception similar to the following, coming from a
> call to getBlob()?

Hi,

Yes, I've seen it before, but only with 10.3.2.1 and earlier.
The repro I have, which is very similar to the code you posted, doesn't
trigger the bug with 10.3.3.0 or newer.
Can you double check the versions (both client and server) you are using?

For the record, I've been running with JDK 1.6.0.

--
Kristian

>
>> java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
>>     at java.util.ArrayList.RangeCheck(ArrayList.java:547)
>>     at java.util.ArrayList.get(ArrayList.java:322)
>>     at org.apache.derby.client.net.NetCursor.findExtdtaData(Unknown Source)
>>     at org.apache.derby.client.net.NetCursor.getBlobColumn_(Unknown Source)
>>     at org.apache.derby.client.am.Cursor.getBlob(Unknown Source)
>>     at org.apache.derby.client.am.ResultSet.getBlob(Unknown Source)
>>     at org.apache.derby.client.am.ResultSet.getBlob(Unknown Source)
>
> This is happening on 10.4.1.3 -- the issue is transient, so even if we
> update to 10.4.2.0 we have no way to know if it has fixed the issue.
> But I have looked through the list of fixes for 10.4.2.0 and nothing
> similar appears to be in it.
>
> The code triggering the error is relatively simple (after removing
> resource closing...)
>
>     UUID uuid;
>     // ...
>
>     PreparedStatement ps = connection.prepareStatement(
>         "SELECT data FROM binaries WHERE guidhigh = ? AND guid = ?");
>     ps.setLong(1, uuid.getMostSignificantBits());
>     ps.setLong(2, uuid.getLeastSignificantBits());
>     rs = ps.executeQuery();
>     if (rs.next()) {
>         Blob blob = rs.getBlob("data");
>
>         // Take slice of blob
>     }
>
> In the mail archives there is a post from someone getting a similar
> issue but for CLOBs. A response asked them to raise a bug for it but
> I can't find a bug along these lines.
>
> Daniel

#3
Kristian Waagan wrote:
> Daniel Noll wrote:
>> Hi all.
>>
>> Has anyone seen an exception similar to the following, coming from a
>> call to getBlob()?
>
> Hi,
>
> Yes, I've seen it before, but only with 10.3.2.1 and earlier.
> The repro I have, which is very similar to the code you posted, doesn't
> trigger the bug with 10.3.3.0 or newer.
> Can you double check the versions (both client and server) you are using?
>
> For the record, I've been running with JDK 1.6.0.

I can confirm that it's 10.4.1.3 for both client and server. We've
been using this version since it came out, and the version of our
software in use has been confirmed to be one where this version was
included.

We're also running on JDK 1.6, though I'm not sure if the version
which was being used when the problem occurred was u6 or u10.

Was this a fixed bug for which there is a JIRA issue? I was unable to
find out, but if one exists, the attached patch would presumably allow
me to confirm whether 10.4.1.3 includes the same fix, or perhaps more
interestingly, whether 10.4.2.0 does.

Daniel

#4
Daniel Noll wrote:
> Kristian Waagan wrote:
>> Daniel Noll wrote:
>>> Hi all.
>>>
>>> Has anyone seen an exception similar to the following, coming from a
>>> call to getBlob()?
>>
>> Hi,
>>
>> Yes, I've seen it before, but only with 10.3.2.1 and earlier.
>> The repro I have, which is very similar to the code you posted,
>> doesn't trigger the bug with 10.3.3.0 or newer.
>> Can you double check the versions (both client and server) you are
>> using?
>>
>> For the record, I've been running with JDK 1.6.0.
>
> I can confirm that it's 10.4.1.3 for both client and server. We've
> been using this version since it came out, and the version of our
> software in use has been confirmed to be one where this version was
> included.
>
> We're also running on JDK 1.6, though I'm not sure if the version
> which was being used when the problem occurred was u6 or u10.
>
> Was this a fixed bug for which there is a JIRA issue? I was unable to
> find out, but if one exists, the attached patch would presumably allow
> me to confirm whether 10.4.1.3 includes the same fix, or perhaps more
> interestingly, whether 10.4.2.0 does.

The Jira you are looking for might be DERBY-3243.
What you report seems to be the same symptom, but at first sight I
think the cause is different.
With a single execution thread, the server should never return one of
the invalid locator values, but I think there's a chance it can happen
if more than one thread calls the locator key generation method. It is
not yet clear to me how this can happen, and I might be wrong.

How hard is it for you to reproduce the error?
It should be simple to write a small patch that verifies that the
locator values generated by the server are valid.

If what I describe is indeed the problem, the bug affects both Blob and
Clob. You might want to log a new Jira issue for this bug.

--
Kristian

#5
[ snip ]
>
> The Jira you are looking for might be DERBY-3243.
> What you report seems to be the same symptom, but at first sight I
> think the cause is different.
> With a single execution thread, the server should never return one of
> the invalid locator values, but I think there's a chance it can happen
> if more than one thread calls the locator key generation method. It is
> not yet clear to me how this can happen, and I might be wrong.

Daniel,

I haven't been able to obtain incorrect values, but I've been able to
obtain the same locator value twice (on a multiprocessor machine and a
slightly hacked Derby). It would help us a lot if we could instrument
Derby to log the locator values that cause the error in your
environment.
Also, on what kind of machine and operating system are you observing the
error?

regards,
--
Kristian

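For readers unfamiliar with this kind of race, here is a minimal sketch of how a shared counter can hand out the same value twice when several threads draw from it without synchronization, and how an atomic increment avoids that. It is an illustration only: the class names (`LocatorCounterDemo`, `UnsafeCounter`, `SafeCounter`) and the helper `countDuplicates` are hypothetical and are not taken from Derby's source, and nothing in the thread confirms that Derby's locator counter looks like this. It assumes Java 8 or newer.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.IntSupplier;

// Illustrative only: these classes are hypothetical and are NOT Derby's code.
class LocatorCounterDemo {

    /** Plain int counter: next++ is a non-atomic read-modify-write. */
    static class UnsafeCounter implements IntSupplier {
        private int next = 1;
        @Override public int getAsInt() { return next++; }
    }

    /** Same counter, but the increment is a single atomic operation. */
    static class SafeCounter implements IntSupplier {
        private final AtomicInteger next = new AtomicInteger(1);
        @Override public int getAsInt() { return next.getAndIncrement(); }
    }

    /**
     * Starts the given number of threads, each drawing perThread values,
     * and returns how many draws produced a value that was already seen.
     */
    static int countDuplicates(IntSupplier counter, int threads, int perThread) {
        Set<Integer> seen = ConcurrentHashMap.newKeySet();
        AtomicInteger duplicates = new AtomicInteger();
        CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                try {
                    start.await(); // release all workers at the same moment
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                for (int i = 0; i < perThread; i++) {
                    if (!seen.add(counter.getAsInt())) {
                        duplicates.incrementAndGet();
                    }
                }
            });
            workers[t].start();
        }
        start.countDown();
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return duplicates.get();
    }

    public static void main(String[] args) {
        System.out.println("unsafe duplicates: "
                + countDuplicates(new UnsafeCounter(), 4, 100_000));
        System.out.println("safe duplicates:   "
                + countDuplicates(new SafeCounter(), 4, 100_000));
    }
}
```

The duplicate count for the unsafe counter varies from run to run and may well be zero on a given run, which fits the low-probability, hard-to-reproduce behaviour discussed in this thread. The atomic version always reports zero duplicates.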
#6
Kristian Waagan wrote:
> I haven't been able to obtain incorrect values, but I've been able to
> obtain the same locator value twice (on a multiprocessor machine and a
> slightly hacked Derby). It would help us a lot if we could instrument
> Derby to log the locator values that cause the error in your
> environment.
> Also, on what kind of machine and operating system are you observing the
> error?

Desktop is: Windows 2003 x86, 4 CPUs, 4GB RAM
Server is: Windows 2003 x86, 4 CPUs, 4GB RAM

Given that the information is so similar between the two, they might
be running on the same computer.

As for reproducibility, I'm unable to reproduce it here at all, and
the end user only had it happen once so far. But when it happened, it
apparently continued to happen for all subsequent BLOB retrievals for
a while, and then "fixed itself".

Each of these BLOB retrievals is being done in a separate
transaction, which makes that last claim even more perplexing, unless
the client reconnected in the meantime and we simply weren't told.

Daniel

#7
Daniel Noll wrote:
> Kristian Waagan wrote:
>> I haven't been able to obtain incorrect values, but I've been able to
>> obtain the same locator value twice (on a multiprocessor machine and
>> a slightly hacked Derby). It would help us a lot if we could
>> instrument Derby to log the locator values that cause the error
>> in your environment.
>> Also, on what kind of machine and operating system are you observing
>> the error?
>
> Desktop is: Windows 2003 x86, 4 CPUs, 4GB RAM
> Server is: Windows 2003 x86, 4 CPUs, 4GB RAM
>
> Given that the information is so similar between the two, they might
> be running on the same computer.
>
> As for reproducibility, I'm unable to reproduce it here at all, and
> the end user only had it happen once so far. But when it happened, it
> apparently continued to happen for all subsequent BLOB retrievals for
> a while, and then "fixed itself".
>
> Each of these BLOB retrievals is being done in a separate
> transaction, which makes that last claim even more perplexing, unless
> the client reconnected in the meantime and we simply weren't told.

Daniel,

Thanks for the information.

I feel we have too little information to create a fix - we don't even
know what the real problem is.
The locator values are drawn from a counter, and there is a counter for
each (root) connection. I'm having trouble understanding how we could
get concurrency issues in this case.
Also, I think the error you are seeing suggests an invalid locator
value, not a duplicate value.

Anything special about your network server setup? (time-slicing,
statement caching, connection pooling)

My suggestion is to wait for a while and see if it happens again, or see
if anyone else has suggestions.

regards,
--
Kristian

#8
Kristian Waagan wrote:
> I feel we have too little information to create a fix - we don't even
> know what the real problem is.
> The locator values are drawn from a counter, and there is a counter for
> each (root) connection. I'm having trouble understanding how we could
> get concurrency issues in this case.
> Also, I think the error you are seeing suggests an invalid locator
> value, not a duplicate value.
>
> Anything special about your network server setup? (time-slicing,
> statement caching, connection pooling)
>
> My suggestion is to wait for a while and see if it happens again, or see
> if anyone else has suggestions.

It has happened again. This time it took 12 hours for it to happen,
which is information I didn't previously have. If I'm lucky this will
help with reproducing it here. Maybe it's something that takes a long
time until it occurs. Or maybe it's something where the probability is
just really low so it takes an enormous number of attempts before it
happens.

As far as the network server setup itself, it's straightforward. We're
not using connection pooling due to bugs preventing that from working
properly, and everything else is normal as well.

I guess I can run a test overnight to see if something similar happens,
with tracing turned on. It's going to generate a lot of output though
so I somewhat fear for my disk space. :-)

Daniel

#9
On 11/10/08 01:27, Daniel Noll wrote:
> Kristian Waagan wrote:
>> I feel we have too little information to create a fix - we don't even
>> know what the real problem is.
>> The locator values are drawn from a counter, and there is a counter
>> for each (root) connection. I'm having trouble understanding how we
>> could get concurrency issues in this case.
>> Also, I think the error you are seeing suggests an invalid locator
>> value, not a duplicate value.
>>
>> Anything special about your network server setup? (time-slicing,
>> statement caching, connection pooling)
>>
>> My suggestion is to wait for a while and see if it happens again, or
>> see if anyone else has suggestions.
>
> It has happened again. This time it took 12 hours for it to happen,
> which is information I didn't previously have. If I'm lucky this will
> help with reproducing it here. Maybe it's something that takes a long
> time until it occurs. Or maybe it's something where the probability is
> just really low so it takes an enormous number of attempts before it
> happens.
>
> As far as the network server setup itself, it's straightforward. We're
> not using connection pooling due to bugs preventing that from working
> properly, and everything else is normal as well.

Are the problems you are having with connection pooling logged in Jira?

>
> I guess I can run a test overnight to see if something similar happens,
> with tracing turned on. It's going to generate a lot of output though
> so I somewhat fear for my disk space. :-)

You can also run the test without logging to see if it can be reproduced
by a 12 hour run. If so, I think we have two initial options:
 a) Synchronize the access to the counter properly
 b) Add custom logging to the code that fails, to see which value
    causes the failure. If it is one of the invalid locator values,
    it's a strong indication that the problem is indeed the counter.

The bug I'm thinking of is one with a low probability, so if it happens
constantly after ~12 hours it sounds more like an overflow problem of
some kind.

If you can give me some more details about the data and the load, I
might be able to kick off some test runs of my own:
 - Blob size
 - number of rows in the table
 - number of clients accessing the table concurrently
 - isolation level
 - page cache size
 - any other information you think might be relevant

--
Kristian
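
To put the overflow idea into numbers, here is a rough sketch of what it would take for a counter to wrap within about 12 hours. It assumes the counter is a signed 32-bit `int` that starts near zero. The thread does not state the counter's width, so that is an assumption, and the class `CounterOverflowSketch` and its methods are hypothetical names, not Derby code. Under that assumption, exhausting the positive range in 12 hours needs roughly 2,147,483,647 / 43,200 ≈ 49,710 draws per second, sustained.

```java
// Illustrative only: hypothetical names, not Derby's code. Assumes a
// signed counter that starts near zero and is incremented by one per draw.
class CounterOverflowSketch {

    /** Draws per second needed to exhaust a signed counter of the given width. */
    static double drawsPerSecondToOverflow(int bits, double hours) {
        double maxValue = Math.pow(2, bits - 1) - 1; // largest positive value
        return maxValue / (hours * 3600.0);
    }

    /** Ordinary increment: silently wraps from the maximum to the minimum value. */
    static int wrappingNext(int current) {
        return current + 1;
    }

    /** Checked increment: throws ArithmeticException instead of wrapping. */
    static int nextChecked(int current) {
        return Math.addExact(current, 1);
    }

    public static void main(String[] args) {
        System.out.printf("32-bit counter, 12 h: %.0f draws/s%n",
                drawsPerSecondToOverflow(32, 12.0));
        System.out.println("wraps to negative: "
                + (wrappingNext(Integer.MAX_VALUE) < 0));
        try {
            nextChecked(Integer.MAX_VALUE);
        } catch (ArithmeticException e) {
            System.out.println("checked increment refused to wrap");
        }
    }
}
```

Whether that rate is plausible depends on the load details Kristian asks for above. A narrower counter, or one that is not reset per connection, would change the arithmetic considerably.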