EHC ❯ DiskStore on 1.6-beta4 prevents recovery from dirty shutdown
Type: Bug
Status: Closed
Resolution: Fixed
Assignee: drb
Reporter: sourceforgetracker
Created: September 21, 2009
Votes: 0
Watchers: 0
Updated: September 22, 2009
Resolved: September 22, 2009
Description
We are testing 1.6.0-beta4 before rolling it out on our production systems, and we found a subtle bug, triggered by dirty shutdowns, that prevents EhCache from starting up again and cleaning up dirty index files.
This bug is new to the 1.6.0 branch, since 1.5.0 worked fine in this respect, and after reviewing the source code we found the culprit.
In our case, what’s happening after dirty shutdowns is that the index/data files remain in the file system but are completely empty (0 bytes). EhCache 1.5.0 handled pre-existing empty/corrupted files properly, discarding them and creating new ones. EhCache 1.6.0 fails to do this and instead aborts with the error “Index file XXX could not created.”.
The reason lies in the implementation of DiskStore.readIndex(), which creates an input stream from the index file and attempts to read some objects. When the file is empty (as it was in our case), an EOFException is thrown and caught by the try-catch block for IOException. This block attempts to fix the problem by invoking createNewIndexFile(), which deletes the file and then creates a new empty file at the same location/filename. But because the input stream was never closed, indexFile.delete() fails silently (it returns false rather than throwing an exception - a known Java API shortcoming), and therefore re-creating the file fails as well.
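To make the failure mode concrete, here is a minimal sketch of the control flow described above. The class name, method bodies and stream handling are illustrative assumptions, not the actual DiskStore source; only readIndex(), createNewIndexFile() and the quoted error message come from the report.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.ObjectInputStream;

// Illustrative sketch only - not the actual DiskStore source.
public class IndexReadSketch {

    static void readIndex(File indexFile) {
        FileInputStream fin = null;
        try {
            fin = new FileInputStream(indexFile);
            // On a 0-byte index file this constructor already throws EOFException
            // (a subclass of IOException) while reading the serialization header.
            ObjectInputStream ois = new ObjectInputStream(fin);
            ois.readObject();
            ois.close();
        } catch (IOException e) {
            // Recovery path for the empty/corrupt file. The FileInputStream opened
            // above is still open, so on platforms where an open file cannot be
            // removed the delete() below returns false and re-creation fails.
            createNewIndexFile(indexFile);
        } catch (ClassNotFoundException e) {
            createNewIndexFile(indexFile);
        }
    }

    static void createNewIndexFile(File indexFile) {
        if (indexFile.exists()) {
            // delete() signals failure only through its return value; here it
            // returns false because the stream above still holds the file.
            indexFile.delete();
        }
        try {
            // The stale file is still present, so createNewFile() returns false
            // and the store gives up with the reported error message.
            if (!indexFile.createNewFile()) {
                throw new IllegalStateException("Index file " + indexFile + " could not created.");
            }
        } catch (IOException e) {
            throw new IllegalStateException("Index file " + indexFile + " could not created.", e);
        }
    }
}
```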
The solution is simple: close the stream before attempting recovery. The DiskStore code in 1.5.0 was slightly different and did not suffer from this problem. Please find attached the patched file with the proposed solution.
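For comparison, here is a sketch of the kind of fix described, under the same assumed structure as the previous sketch (the actual change is in the attached patch, which is not reproduced here): the catch block releases the file handle before invoking createNewIndexFile(), so the stale file can actually be deleted and re-created.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.ObjectInputStream;

// Illustrative sketch of the proposed fix - not the attached patch itself.
public class IndexReadFixSketch {

    static void readIndex(File indexFile) {
        FileInputStream fin = null;
        try {
            fin = new FileInputStream(indexFile);
            ObjectInputStream ois = new ObjectInputStream(fin);
            ois.readObject();
            ois.close();
            fin = null;
        } catch (IOException e) {
            // Close the stream first so the stale index file can be deleted,
            // then fall back to re-creating it from scratch.
            closeQuietly(fin);
            fin = null;
            createNewIndexFile(indexFile);
        } catch (ClassNotFoundException e) {
            closeQuietly(fin);
            fin = null;
            createNewIndexFile(indexFile);
        } finally {
            closeQuietly(fin);
        }
    }

    private static void closeQuietly(FileInputStream fin) {
        if (fin == null) {
            return;
        }
        try {
            fin.close();
        } catch (IOException ignored) {
            // nothing useful to do if closing the corrupt index file fails
        }
    }

    private static void createNewIndexFile(File indexFile) {
        // same delete-and-recreate logic as in the previous sketch; with the
        // stream closed, delete() and createNewFile() can now succeed
    }
}
```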
I hope this makes it into 1.6.0 final - it’s a small bug, but it prevents apps from restarting gracefully after dirty shutdowns.
Sourceforge Ticket ID: 2750578 - Opened By: mads1980 - 10 Apr 2009 16:11 UTC
Re-opening so that I can properly close out these issues and have the correct Resolution status in Jira.