EHC ❯ MemoryEfficientByteArrayOutputStream returns oversized arrays
-
Bug
-
Status: Closed
-
0 Showstopper
-
Resolution: Fixed
-
ehcache-core
-
-
cdennis
-
Reporter: cdennis
-
July 19, 2010
-
0
-
Watchers: 1
-
October 19, 2010
-
July 27, 2010
Description
MemoryEfficientByteArrayOutputStream is designed to reduce the overhead due to arraycopy calls during serialization. It does this by both pre-sizing the array to a larger value than normally used by the JDK, and by also not creating a new array on return. This causes problems since the default approach to using this class is via a static method that tracks the returned size of the array and stores the value in a static lastSize variable, which is used to pre-size the array on subsequent calls. The problem with this is that the size used to set the lastSize variable is the size of the returned array which is not copied and not shrunk-to-fit. This means should I serialize a large object into my cache (say 100kb) then the output stream instance will size the array up to 100kb (or bigger) and then return that resized array, whose size is saved to lastSize. Should I subsequently call serialize using a smaller value (1kb for example) then the array is presized to 100kb, and the returned array is not shrunk, hence it returns a 100kb array consisting of mainly nulls. The size set in last size is not however the occupied 1kb size, but the 100kb array length. In effect this means that every subsequent serialize call will produce 100kb arrays regardless.
The fix I would suggest for this is to only return the original array in the case that the array is exactly the right size. If not we make a correctly sized copy and return that instead. This will restore the desired behavior regarding last size, it will prevent the arraycopy in the common case of everything being the same serialized size, and it still prevents numerous resize copies during the serialization as originally desired.