Conversation
…t might be trying to solve an non-existent issue. Chaned WeakReference to SoftReference to make cache more persistent in memory. Addresses issue #1643.
|
@cmnbroad Can you take a look at this? It seems like weak -> soft reference is the right change, and hopefully removing the unsychronized method helps, but it would be good if you could take a look. |
| private byte[] findInCache(final String name) { | ||
| final WeakReference<byte[]> weakReference = cacheW.get(name); | ||
| final SoftReference<byte[]> weakReference = cache.get(name); | ||
| if (weakReference != null) { |
There was a problem hiding this comment.
Can you explicitly check for an empty array here as well, and throw an informative error if you encounter one?
There was a problem hiding this comment.
Under what circumstances the bases array would be an empty array as supposed to a null (claimed by the GC)?
|
I don't have any issue with changing We can't remove the backing reference though, since that was added to address a real performance issue (described in the comments). As I understand it, I agree that the class here contract isn't clear; though, the synchronized does make it look like it's thread safe, but it isn't. I'm not sure why that was originally in there. |
|
Thanks @cmnbroad. So far there is no evidence that Soft for Weak fixes the issue, may well in most cases but as you mentioned is not guarantee in all implementations. I guess that there is two ways to proceed, either we actually make it thread safe adding back the last-requested hard reference or we simply scrap all the thread safety away and enforce the originally intended non-thread safe contract. I think the most pragmatical would be the former; too make it truly thread-safe which is always an improvement on the existing code. But perhaps you see this as an opportunity to "make it right" although with the risk of breaking additional third party dependent code. |
|
@cmnbroad I think the Weak -> Soft reference switch makes sense, it definitely seems like the intent is that WeakReferences are collected eagerly and Soft are kept until there is memory pressure. It's possible that a jvm will implement it badly but I would guess that it's going to do basically what we want on any normal jvm we're likely to run on. I think it's very possible that the switch will fix the performance issue even if the explicit reference is removed. With the Weak references the cache is effectively doing nothing, which explains why holding the last one specifically was necessary. If there is bad memory pressure we could still see thrashing here, but the alternative might be that we hold the reference and get OOM? Probably safest to keep the backing bases if you're worried about it, but I think in any case if we want the cache to work we should swap Weak -> Soft. |
|
If we reinstate the 'backed' strong references would it make sense that these are updated and use regardless whether the invoking code calls get getReferenceSequence or getReferenceSequenceByRegion? The original code would only use and update the backed references in the latter is called. Is the a reason against doing so for both? |
Description
Removed 'backed' last cache hit hard references; it seems that that it might be trying to solve an non-existent issue.
Changed WeakReference to SoftReference to make cache more persistent in memory. Addresses GATK issue #1643.
Things to think about before submitting: