• Bug
  • Status: Open
  • 2 Major
  • Resolution:
  • DSO:L1,Integration Modules,Lock Manager
  • eng group
  • Reporter: romswo
  • June 02, 2010
  • 0
  • Watchers: 1
  • November 02, 2010

Attachments

Description

h5. Environment:
* Terracotta 3.2.1_2 (with tim-tomcat-6.0 2.1.3 and tim-annotations 1.5.2)
* Tomcat 6.0.20
* JDK 1.6.0_17
* Windows XP / Windows 7

h5. Issue:

When Tomcat is shutting down (clean shutdown triggered by Ctrl+C or the shutdown.bat script) and a servlet defined in web.xml tries to access DSO from its {{destroy()}} method, a deadlock occurs and Tomcat (and the whole JVM) never shuts down. I believe Terracotta is trying to acquire a lock on the DSO while ClientLockManagerImpl is already in a state other than RUNNING (although I cannot confirm that), so it waits indefinitely. Dump from the catalina shutdown thread:

"Thread-19" Id=57 WAITING on com.tcclient.util.concurrent.locks.ConditionObject$SyncCondition@13946b0
	at java.lang.Object.wait(Native Method)
	at java.lang.Object.wait(Object.java:485)
	at com.tc.object.bytecode.ManagerImpl.wait(ManagerImpl.java:822)
	at com.tc.object.bytecode.ManagerUtil.objectWait(ManagerUtil.java:508)
	at com.tcclient.util.concurrent.locks.ConditionObject.await(ConditionObject.java:103)
	at com.tc.object.locks.ClientLockManagerImpl.waitUntilRunning(ClientLockManagerImpl.java:585)
	at com.tc.object.locks.ClientLockManagerImpl.lock(ClientLockManagerImpl.java:90)
	at com.tc.object.bytecode.ManagerImpl.lock(ManagerImpl.java:728)
	at com.tc.object.bytecode.ManagerUtil.monitorEnter(ManagerUtil.java:547)
	at java.util.concurrent.locks.ReentrantReadWriteLock$DsoLock.lock(ReentrantReadWriteLock.java:44)
	at java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(ReentrantReadWriteLock.java)
	at java.util.concurrent.ConcurrentHashMap$Segment.__tc_readLock(ConcurrentHashMap.java)
	at java.util.concurrent.ConcurrentHashMap$Segment.get(ConcurrentHashMap.java:281)
	at java.util.concurrent.ConcurrentHashMap.get(Unknown Source)
	at TestServlet.destroy(TestServlet.java:18)
	at org.apache.catalina.core.StandardWrapper.unload(StandardWrapper.java:1393)
	- locked <0x7f43b2> (a org.apache.catalina.core.StandardWrapper)
	at org.apache.catalina.core.StandardWrapper.stop(StandardWrapper.java:1738)
	at org.apache.catalina.core.StandardContext.stop(StandardContext.java:4509)
	- locked <0x1cd3823> (a org.apache.catalina.core.StandardContext)
	at org.apache.catalina.core.ContainerBase.removeChild(ContainerBase.java:924)
	at org.apache.catalina.startup.HostConfig.undeployApps(HostConfig.java:1191)
	at org.apache.catalina.startup.HostConfig.stop(HostConfig.java:1162)
	at org.apache.catalina.startup.HostConfig.lifecycleEvent(HostConfig.java:313)
	at org.apache.catalina.util.LifecycleSupport.fireLifecycleEvent(LifecycleSupport.java:117)
	at org.apache.catalina.core.ContainerBase.stop(ContainerBase.java:1086)
	- locked <0x13571f0> (a org.apache.catalina.core.StandardHost)
	at org.apache.catalina.core.ContainerBase.stop(ContainerBase.java:1098)
	- locked <0x1cb704a> (a org.apache.catalina.core.StandardEngine)
	at org.apache.catalina.core.StandardEngine.stop(StandardEngine.java:448)
	at org.apache.catalina.core.StandardService.stop(StandardService.java:584)
	- locked <0x1cb704a> (a org.apache.catalina.core.StandardEngine)
	at org.apache.catalina.core.StandardServer.stop(StandardServer.java:744)
	at org.apache.catalina.startup.Catalina.stop(Catalina.java:628)
	at org.apache.catalina.startup.Catalina$CatalinaShutdownHook.run(Catalina.java:671)

This problem was originally found in a BlazeDS web application. There is a MessageBrokerServlet class which cleans things up in {{destroy()}}. I had a custom extension of this class to clean up the shared DSO. However, the problem itself is not related to BlazeDS in any way. Attached is a sample web application with one servlet that inserts data into a DSO map on init and tries to read this data on destroy. The attached application does not use BlazeDS.

In the original (BlazeDS) application the problem was present only when there was some data in the shared DSO maps. I.e. if I removed all the entries from the shared maps before closing the application, Tomcat shut down correctly. If there were some entries left, the deadlock occurred.

What is interesting - with the attached sample application the problem is always present. Even if there are no entries in shared map, the deadlock on shutdown occurs.

I found a similar issue (CDV-1376) but related to JBoss and RemoteTransactionMgr. However the effect is exactly the same for my issue.
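To make the suspected mechanism concrete: the dump shows the shutdown thread parked in {{ClientLockManagerImpl.waitUntilRunning}}. Below is a minimal toy model of that pattern, assuming my guess about the state is right. {{ToyLockManager}} and all of its members are hypothetical names, not Terracotta's code, and the wait is timed only so that the example terminates instead of hanging.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for a client lock manager; NOT Terracotta's actual code.
class ToyLockManager {
    enum State { RUNNING, STOPPED }

    private final ReentrantLock stateLock = new ReentrantLock();
    private final Condition running = stateLock.newCondition();
    private State state = State.RUNNING;

    // What a client shutdown hook might do: leave the RUNNING state for good.
    void shutdown() {
        stateLock.lock();
        try {
            state = State.STOPPED;
        } finally {
            stateLock.unlock();
        }
    }

    // Waits until the manager is RUNNING, giving up after timeoutMillis.
    // Returns true if the lock request could proceed, false on timeout or interrupt.
    // With an untimed await() this would block forever once the state is STOPPED,
    // because nothing ever signals the condition again.
    boolean lock(long timeoutMillis) {
        long remaining = TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        stateLock.lock();
        try {
            while (state != State.RUNNING) {
                if (remaining <= 0L) {
                    return false;
                }
                remaining = running.awaitNanos(remaining);
            }
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            stateLock.unlock();
        }
    }
}

class ToyLockManagerDemo {
    public static void main(String[] args) {
        ToyLockManager manager = new ToyLockManager();
        System.out.println("before shutdown: " + manager.lock(100)); // true
        manager.shutdown();
        // Models destroy() touching a shared map after the client has stopped.
        System.out.println("after shutdown: " + manager.lock(100));  // false
    }
}
```

In the real dump there is no timeout, which would explain why the shutdown thread never returns from {{destroy()}}.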

Comments

Roman Swoszowski 2010-06-02

Update - this is not true: “In the original (BlazeDS) application the problem was present only when there were some data in the shared DSO maps”.

The problem is ALWAYS present (both if there are entries in the DSO shared maps and if the DSO maps are empty).

Fiona OShea 2010-06-08

Please take a quick look

Tim Eck 2010-06-08

Unfortunately there is no way to guarantee the order of VM shutdown hooks. The core Terracotta client needs to use a shutdown hook to make sure any and all object changes are flushed to the Terracotta server before the VM exits.

The only workaround for this issue at the moment is to read/write shared objects from within a shutdown hook.

Tim Eck 2010-06-08

I’m not sure when/if we would ever decide to do this, but there are 2 things I could see that we could do to make interactions with shutdown hooks better:

Instead of actually shutting down the L1 in the shutdown hook, we flush the buffer and put the entire L1 into some sort of “synch write” mode (i.e. all transactions get synch-write semantics applied to them). That would allow other clustered operations to work and still make sure everything is flushed before VM exit.

The other idea is for custom mode only and it is to make sure we’re always the last shutdown hook to run. Since this is custom mode only and dependent on VM internals it is really a non-starter.
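A rough sketch of the first idea, purely to illustrate the intended behavior. {{ToyTransactionBuffer}} and its methods are hypothetical names, the "server" is just an in-memory list, and none of this reflects how the L1 is actually implemented.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical write-behind buffer; illustrates the idea only, not the real L1.
class ToyTransactionBuffer {
    private final List<String> pending = new ArrayList<>();   // buffered, not yet sent
    private final List<String> committed = new ArrayList<>(); // stands in for the server
    private boolean syncWrite = false;

    // Normal mode buffers the change; sync-write mode commits it before returning.
    synchronized void write(String change) {
        if (syncWrite) {
            committed.add(change);
        } else {
            pending.add(change);
        }
    }

    // What the shutdown hook would do instead of stopping the client:
    // flush everything buffered so far, then stop buffering new changes.
    synchronized void flushAndEnterSyncWriteMode() {
        committed.addAll(pending);
        pending.clear();
        syncWrite = true;
    }

    synchronized int pendingCount() {
        return pending.size();
    }

    synchronized List<String> committedChanges() {
        return new ArrayList<>(committed);
    }
}
```

After {{flushAndEnterSyncWriteMode()}} has run, a later {{write(...)}} from another hook or from a servlet's {{destroy()}} would still succeed, and nothing would be left in the buffer when the VM exits.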

Roman Swoszowski 2010-06-10

I tried this, but as you said there is no way to guarantee the order of VM shutdown hooks. Additionally, shutdown hooks are executed in parallel and not sequentially. So even if I register my own shutdown hook, I have no guarantee that my logic will execute before the Terracotta client starts cleaning up its components.
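A small standalone check of the "in parallel" part, using only the JDK. The class and hook names are made up for this example. Each hook waits up to two seconds for the other one to start, which could not succeed for both if the hooks ran one after another.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

class ShutdownHookOverlapDemo {
    // Set to true once a hook has observed the other hook running at the same time.
    static final AtomicBoolean overlapped = new AtomicBoolean(false);

    // Builds two hook threads that each wait (up to 2 s) for the other to start.
    static Thread[] buildHooks() {
        CountDownLatch bothStarted = new CountDownLatch(2);
        Runnable body = () -> {
            bothStarted.countDown();
            boolean sawOther = false;
            try {
                sawOther = bothStarted.await(2, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            if (sawOther) {
                overlapped.set(true);
            }
            System.out.println(Thread.currentThread().getName()
                    + " saw the other hook running: " + sawOther);
        };
        return new Thread[] { new Thread(body, "hook-a"), new Thread(body, "hook-b") };
    }

    public static void main(String[] args) {
        for (Thread hook : buildHooks()) {
            Runtime.getRuntime().addShutdownHook(hook);
        }
        // main returns here; the JVM then starts all registered hooks.
    }
}
```

Because the hooks overlap like this, my cleanup hook and the Terracotta client's hook can be running at the same moment, and I cannot make mine finish first.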

Is there any other way to clear my DSO maps when the application shuts down?