When RMI replication is used, the statically set remoteObjectPort can’t be rebound on application restart

Type: Bug
Status: Open
Priority: 2 Major
Resolution:
Component/s: ehcache-core
Labels:

Assignee:
Reporter: hostalp
Created: October 13, 2009
Votes: 1
Watchers: 1
Updated:
October 11, 2011
Resolved:

Attachments

gwi (12.00 k) application/zip ehcache_EHC-422.zip

Description

Hello,

we use Ehcache as 2nd level Hibernate cache only, having RMI replication enabled with all the following ports statically set:

net.sf.ehcache.distribution.RMICacheManagerPeerProviderFactory multicastGroupPort
net.sf.ehcache.distribution.RMICacheManagerPeerListenerFactory port remoteObjectPort

However the last one (remoteObjectPort) causes troubles when application is restarted without restarting the whole application server.

I noticed that when ehcache starts it creates sockets for all those configured ports and when it’s stopped but the application server left running all these sockets remain open.

When it’s started again then, with the first two mentioned ports there appears to be no problem, but for remoteObjectPort we get the followin exception:

Caused by: javax.persistence.PersistenceException: org.hibernate.HibernateException: Could not instantiate cache implementation at org.hibernate.ejb.Ejb3Configuration.buildEntityManagerFactory(Ejb3Configuration.java:737) at org.hibernate.ejb.HibernatePersistence.createContainerEntityManagerFactory(HibernatePersistence.java:127) at org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean.createNativeEntityManagerFactory(LocalContainerEntityManagerFactoryBean.java:224) at org.springframework.orm.jpa.AbstractEntityManagerFactoryBean.afterPropertiesSet(AbstractEntityManagerFactoryBean.java:291) at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.invokeInitMethods(AbstractAutowireCapableBeanFactory.java:1369) at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.initializeBean(AbstractAutowireCapableBeanFactory.java:1335) … 143 more Caused by: org.hibernate.HibernateException: Could not instantiate cache implementation at org.hibernate.cache.CacheFactory.createCache(CacheFactory.java:64) at org.hibernate.impl.SessionFactoryImpl.(SessionFactoryImpl.java:214) at org.hibernate.cfg.Configuration.buildSessionFactory(Configuration.java:1300) at org.hibernate.cfg.AnnotationConfiguration.buildSessionFactory(AnnotationConfiguration.java:915) at org.hibernate.ejb.Ejb3Configuration.buildEntityManagerFactory(Ejb3Configuration.java:730) ... 148 more Caused by: org.hibernate.cache.CacheException: net.sf.ehcache.CacheException: Problem starting listener for RMICachePeer null. Initial cause was Port already in use: 41103; nested exception is: java.net.BindException: Address already in use at net.sf.ehcache.hibernate.EhCacheProvider.buildCache(EhCacheProvider.java:96) at org.hibernate.cache.CacheFactory.createCache(CacheFactory.java:61) ... 152 more Caused by: net.sf.ehcache.CacheException: Problem starting listener for RMICachePeer null. Initial cause was Port already in use: 41103; nested exception is: java.net.BindException: Address already in use at net.sf.ehcache.distribution.RMICacheManagerPeerListener.notifyCacheAdded(RMICacheManagerPeerListener.java:529) at net.sf.ehcache.event.CacheManagerEventListenerRegistry.notifyCacheAdded(CacheManagerEventListenerRegistry.java:157) at net.sf.ehcache.CacheManager.addCacheNoCheck(CacheManager.java:708) at net.sf.ehcache.CacheManager.addCache(CacheManager.java:685) at net.sf.ehcache.CacheManager.addCache(CacheManager.java:643) at net.sf.ehcache.hibernate.EhCacheProvider.buildCache(EhCacheProvider.java:90) ... 153 more Caused by: java.rmi.server.ExportException: Port already in use: 41103; nested exception is: java.net.BindException: Address already in use at sun.rmi.transport.tcp.TCPTransport.listen(TCPTransport.java:284) at sun.rmi.transport.tcp.TCPTransport.exportObject(TCPTransport.java:219) at sun.rmi.transport.tcp.TCPEndpoint.exportObject(TCPEndpoint.java:398) at sun.rmi.transport.LiveRef.exportObject(LiveRef.java:131) at sun.rmi.server.UnicastServerRef.exportObject(UnicastServerRef.java:195) at java.rmi.server.UnicastRemoteObject.exportObject(UnicastRemoteObject.java:306) at java.rmi.server.UnicastRemoteObject.exportObject(UnicastRemoteObject.java:269) at java.rmi.server.UnicastRemoteObject.(UnicastRemoteObject.java:168) at net.sf.ehcache.distribution.RMICachePeer.(RMICachePeer.java:65) at net.sf.ehcache.distribution.RMICacheManagerPeerListener.notifyCacheAdded(RMICacheManagerPeerListener.java:525) ... 158 more Caused by: java.net.BindException: Address already in use at java.net.PlainSocketImpl.bind(PlainSocketImpl.java:417) at java.net.ServerSocket.bind(ServerSocket.java:342) at java.net.ServerSocket.(ServerSocket.java:208) at java.net.ServerSocket.(ServerSocket.java:120) at sun.rmi.transport.proxy.RMIDirectSocketFactory.createServerSocket(RMIDirectSocketFactory.java:46) at sun.rmi.transport.proxy.RMIMasterSocketFactory.createServerSocket(RMIMasterSocketFactory.java:350) at sun.rmi.transport.tcp.TCPEndpoint.newServerSocket(TCPEndpoint.java:638) at sun.rmi.transport.tcp.TCPTransport.listen(TCPTransport.java:272) ... 167 more

So it tried to create the same socket again and thus fails.

Simple workaround is to not specify the static value for remoteObjectPort so that it another socket is open, however this could lead to many sockets left open after series ot application restarts and also disallows the use of static port in there.

Comments

gluck 2009-12-02

Are you calling CacheManager.shutdown() before starting again?

Petr H 2009-12-03

We completely rely on hibernate (in particular Hibernate 3.2.6) here, not taking care of anything regarding Hibernate caches from our code at all. Hibernate’s configured with hibernate.cache.provider_class set to net.sf.ehcache.hibernate.EhCacheProvider

Tested on WebSphere AS 6.1.0.25:

server with application started
application stopped, server left running - all three ports defined in ehcache.xml (see above) remain open in LISTEN state
application started back - java.rmi.server.ExportException: Port already in use - for the remoteObjectPort

gluck 2009-12-03

When you say you “stop” an application do you mean that you undeploy it or do something else in the AppServer that would trigger a call to contextDestroyed in any registered ServletContextListeners. I assume that is what you mean.

Hibernate’s SessionFactory.close() calls the EhCacheProvider.stoip() which calls CacheManager.shutdown(). So, you need a registered ServletContextListener which will call SessionFactory.close() in its contextDestroyed method?

Petr H 2009-12-03

As for the application stop in WAS, it destroys all registered context listeners and besides that perhaps pefrorms some other operations.

Basically we’ve got the hibernate’s session factory hidden behind entity manager (org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean) so we don’t use it directly. Perhaps something’s missing on our side which prevents the close call on it on destroy. With some debug logging on hibernate it really seems like it doesn’t happen.

gluck 2009-12-03

Ok, then add a ServletContextListener and grab the reference to SessionFactory from Spring and call close.

Petr H 2009-12-08

Just a quick update to let you know that we didn’t dismiss this. We noticed some differences in behaviour on Tomcat vs. WebSphere where on Tomcat the hibernate SessionFactory is closed while on WAS it’s not (on the same application of course). We’ll try to figure out why.

Petr H 2009-12-08

Alright, I think we’ve got the WAS difference issue - it was pretty simple though - it was only that some other webmodule from the app stopped before the critical one and took down the log4j logging so it wasn’t just logged, that’s all.

So when stopping the app on WAS we get something like:

SystemOut: 16:30:31:818 Destroying Spring FrameworkServlet ‘remoting’ 16:30:31:821 [remoting]: Destroy successful. 16:30:31:846 [cookieSetter]: Destroy successful. 16:30:31:856 Context destroyed (this is our message notifying that context is destroyed) 16:30:31:856 EMF isOpen=true (this is our message notifying that EntityManagerFactory is still open) 16:30:31:856 EMF closing (this is our message notifying that EntityManagerFactory is about to be closed) 16:30:31:916 Closing Spring root WebApplicationContext 16:30:31:920 Shutting down log4j 16:30:32:081 Context destroyed (this is our message notifying that context is destroyed) 16:30:32:082 EMF isOpen=false (this is our message notifying that EntityManagerFactory is closed now)

spring: 16:30:31,820 XmlWebApplicationContext Closing org.springframework.web.context.support.XmlWebApplicationContext … 16:30:31,821 DefaultListableBeanFactory Destroying singletons in org.springframework.beans.factory.support.DefaultListableBeanFactory … 16:30:31,856 DefaultListableBeanFactory Returning cached instance of singleton bean ‘entityManagerFactory’ 16:30:31,918 XmlWebApplicationContext Closing org.springframework.web.context.support.XmlWebApplicationContext … 16:30:31,919 DefaultListableBeanFactory Destroying singletons in org.springframework.beans.factory.support.DefaultListableBeanFactory … 16:30:31,919 DisposableBeanAdapter Invoking destroy() on bean with name ‘jmxExporter’ (here logging stops as log4j goes down)

hibernate: 16:30:31,857 SessionFactoryImpl closing (here logging stops as log4j goes down)

ehcache: 16:30:31,857 RMICacheManagerPeerListener Removing from RMI listener 16:30:31,859 RMICacheManagerPeerListener 144 RMICachePeers bound in registry for RMI listener … 16:30:31,913 RMICacheManagerPeerListener Removing from RMI listener 16:30:31,914 RMICacheManagerPeerListener 0 RMICachePeers bound in registry for RMI listener 16:30:31,914 MulticastKeepaliveHeartbeatReceiver dispose called 16:30:31,915 RMICacheManagerPeerListener rmiregistry unexported. 16:30:31,915 RMICacheManagerPeerListener 0 RMICachePeers unbound from registry in RMI listener (here logging stops as log4j goes down)

After this the net.sf.ehcache.distribution.RMICacheManagerPeerProviderFactory multicastGroupPort is released but net.sf.ehcache.distribution.RMICacheManagerPeerListenerFactory port and remoteObjectPort are still left open in LISTEN state.

So I believe the stop procedure performed correctly, but those ports are still left open.

Petr H 2009-12-08

And if it’s any help, here are few pieces from the JVM thread dump after the app is stopped which may be of some interest:

for RMICacheManagerPeerListenerFactory port (40050) 3XMTHREADINFO “RMI TCP Accept-40050” (TID:0x0A6E3200, sys_thread_t:0x0A6E2288, state:R, native ID:0x0000375E) prio=5 4XESTACKTRACE at java/net/PlainSocketImpl.socketAccept(Native Method) 4XESTACKTRACE at java/net/PlainSocketImpl.accept(PlainSocketImpl.java:457) 4XESTACKTRACE at java/net/ServerSocket.implAccept(ServerSocket.java:473) 4XESTACKTRACE at java/net/ServerSocket.accept(ServerSocket.java:444) 4XESTACKTRACE at sun/rmi/transport/tcp/TCPTransport.run(TCPTransport.java:375) 4XESTACKTRACE at java/lang/Thread.run(Thread.java:810)

for RMICacheManagerPeerListenerFactory remoteObjectPort (40110) 3XMTHREADINFO “RMI TCP Accept-40110” (TID:0x0A6A1500, sys_thread_t:0x0A6A3928, state:R, native ID:0x00003762) prio=5 4XESTACKTRACE at java/net/PlainSocketImpl.socketAccept(Native Method) 4XESTACKTRACE at java/net/PlainSocketImpl.accept(PlainSocketImpl.java:457(Compiled Code)) 4XESTACKTRACE at java/net/ServerSocket.implAccept(ServerSocket.java:473(Compiled Code)) 4XESTACKTRACE at java/net/ServerSocket.accept(ServerSocket.java:444) 4XESTACKTRACE at sun/rmi/transport/tcp/TCPTransport.run(TCPTransport.java:375) 4XESTACKTRACE at java/lang/Thread.run(Thread.java:810)

3XMTHREADINFO “RMI RenewClean-[10.2.1.177:40110,net.sf.ehcache.distribution.ConfigurableRMIClientSocketFactory@1d4c0]” (TID:0x09F0B700, sys_thread_t:0x0B3D6458, state:CW, native ID:0x00003769) prio=5 4XESTACKTRACE at java/lang/Object.wait(Native Method) 4XESTACKTRACE at java/lang/Object.wait(Object.java:231(Compiled Code)) 4XESTACKTRACE at java/lang/ref/ReferenceQueue.remove(ReferenceQueue.java:102) 4XESTACKTRACE at sun/rmi/transport/DGCClient$EndpointEntry$RenewCleanThread.run(DGCClient.java:516) 4XESTACKTRACE at java/lang/Thread.run(Thread.java:810)

gluck 2009-12-08

Thanks for clarifying. We will take a look at this.

Guido Wischrop 2010-04-15

We have the same problem here. I did some research and I think the reason for this problem is http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4457683

I tried the suggested workaround (http://archives.java.sun.com/cgi-bin/wa?A2=ind0507&L=rmi-users&P=4036), but it did not work as expected. So I made my own workaround by creating a SimpleRMISocketFactory which keeps track of all created ServerSockets and provides a method to close them. I had to modify RMICacheManagerPeerListener, RMICacheManagerPeerListenerFactory and RMICachePeer to use this factory (created PatchedRMICacheManagerPeerListener, Patched…). Then I modified the ehache.xml to use the new PatchedRMICacheManagerPeerListenerFactory.

To close the opened ServerSockets when the application is undeployed I’m using my own ContextListener which is registered in the web.xml. Some sample code for the ContextListener:

public class SimpleContextListener implements ServletContextListener {

[…] public void contextDestroyed(ServletContextEvent sce) { LOG.info(“Context shutdown - setting RMIFailureHandler to end orphaned RMI threads!”); RMIFailureHandler failureHandler = RMISocketFactory.getFailureHandler(); RMISocketFactory.setFailureHandler(new SimpleRMIFailureHandler());

[…]

  try
  {
     Thread.sleep(2000);
  } catch (InterruptedException e)
  {
     //ignore
  }
  LOG.info("Context shutdown - resetting RMIFailureHandler to old value!");
  RMISocketFactory.setFailureHandler(failureHandler);
  LOG.info("Context shutdown - done!"); \}

We have to set our own RMIFailureHandler to end RMI threads which otherwise will retry to open a new ServerSocket once the original socket is closed.

There might be problems with other RMI threads because the RMIFailureHandler is global. It works for us, so I’m providing the solution here.

Any suggestion are welcome.

Guido Wischrop 2010-04-15

the source code for SimpleRMIFailureHandler, SimpleRMISocketFactory and the patched ehcache classes PatchedRMICacheManagerPeerListener, PatchedRMICacheManagerPeerListenerFactory and PatchedRMICachePeer

Guido Wischrop 2010-04-15

I forgot to mention, that this problem seems to be fixed with Java 1.6.

Fiona OShea 2010-09-01

Moving all unresolved “Fix Revision 2.2.1” to fix revision “unknown” as we are releasing Magnum first which is 2.3. Currently not sure which fix version these will actually be in, but they are targeted for Fremantle release

Fiona OShea 2011-02-22

MOving unresolved P2 jiras to Ulloa - to be reviewed by Chris, Fiona, Greg soon