Weblogic 10.3.0 在 AIX6.1、JDK1.6 下挂起解决方法
目录(?)[+]
上周在AIX6.1下安装weblogic10.3.0,并配置了hacmp集群环境,但是接下来的几天遇到了挂起问题,为此还加班了一天。
现象描述:
Weblogic启动后,10到30分钟就会hang住,应用和管理控制台都无法访问。强制kill -9 pid后端口无法释放,使用rmsock 命令查看端口显示Wait for exiting processes to be cleaned up before removing the socket。
分析及处理过程
1. 用ps –ef | grep java找到weblogic进程,每隔三分种执行kill -3 pid,在domain目录下生成javacore文件
2. 分析weblogic日志,发现如下内容
<Aug 21, 2009 4:33:37 AM CDT> <Error> <WebLogicServer> <BEA-000337> <[STUCK] ExecuteThread: ‘1′ for queue: ‘weblogic.kernel.Default (self-tuning)’ has been busy for “620″ seconds working on the request
“weblogic.work.SelfTuningWorkManagerImpl$WorkAdapterImpl@20de20de”, which is more than the configured time (StuckThreadMaxTime) of “600″ seconds. Stack trace:
java.net.SocketOutputStream.socketWrite0(Native Method)
java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:103)
……
<Aug 21, 2009 4:34:37 AM CDT> <Error> <WebLogicServer> <BEA-000337> <[STUCK] ExecuteThread: ‘1′ for queue: ‘weblogic.kernel.Default (self-tuning)’ has been busy for “680″ seconds working on the request
“weblogic.work.SelfTuningWorkManagerImpl$WorkAdapterImpl@20de20de”, which is more than the configured time (StuckThreadMaxTime) of “600″ seconds. Stack trace:
java.net.SocketOutputStream.socketWrite0(Native Method)
java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:103)
……
3. 用IBM Thread and Monitor Dump Analyzer for java分析刚才生成的thread dump,找到如下两个线程信息:
3XMTHREADINFO “[ACTIVE] ExecuteThread: ‘5′ for queue: ‘weblogic.kernel.Default (self-tuning)’” TID:0×39CBED00, j9thread_t:0×3751C83C, state:R, prio=5
3XMTHREADINFO1 (native thread ID:0xCE1DB, native priority:0×5, native policy:UNKNOWN)
4XESTACKTRACE at java/net/PlainSocketImpl.socketClose0(Native Method)
4XESTACKTRACE at java/net/PlainSocketImpl.socketPreClose(PlainSocketImpl.java:706)
4XESTACKTRACE at java/net/PlainSocketImpl.close(PlainSocketImpl.java:540)
4XESTACKTRACE at java/net/SocksSocketImpl.close(SocksSocketImpl.java:1041)
4XESTACKTRACE at java/net/Socket.close(Socket.java:1343)
4XESTACKTRACE at weblogic/socket/SocketMuxer.closeSocket(SocketMuxer.java:475)
4XESTACKTRACE at weblogic/socket/SocketMuxer.cancelIo(SocketMuxer.java:813)
4XESTACKTRACE at weblogic/socket/SocketMuxer$TimerListenerImpl.timerExpired(SocketMuxer.java:1021(Compiled Code))
4XESTACKTRACE at weblogic/timers/internal/TimerImpl.run(TimerImpl.java:273(Compiled Code))
4XESTACKTRACE at weblogic/work/SelfTuningWorkManagerImpl$WorkAdapterImpl.run(SelfTuningWorkManagerImpl.java:516(Compiled Code))
4XESTACKTRACE at weblogic/work/ExecuteThread.execute(ExecuteThread.java:201(Compiled Code))
4XESTACKTRACE at weblogic/work/ExecuteThread.run(ExecuteThread.java:173)
3XMTHREADINFO “ExecuteThread: ‘7′ for queue: ‘weblogic.socket.Muxer’” TID:0×35381D00, j9thread_t:0×35385864, state:R, prio=5
3XMTHREADINFO1 (native thread ID:0xB916F, native priority:0×5, native policy:UNKNOWN)
4XESTACKTRACE at weblogic/socket/PosixSocketMuxer.poll(Native Method)
4XESTACKTRACE at weblogic/socket/PosixSocketMuxer.processSockets(PosixSocketMuxer.java:102(Compiled Code))
4XESTACKTRACE at weblogic/socket/SocketReaderRequest.run(SocketReaderRequest.java:29)
4XESTACKTRACE at weblogic/socket/SocketReaderRequest.execute(SocketReaderRequest.java:42)
4XESTACKTRACE at weblogic/kernel/ExecuteThread.execute(ExecuteThread.java:145)
4XESTACKTRACE at weblogic/kernel/ExecuteThread.run(ExecuteThread.java:117)
4. 执行线程只有这两个是running状态,一个做CLOSE(),一个做POLL()。别的都是blocked或者wait状态。
5. 经过metalink查询以及和800支持人员确认,这是Weblogic在AIX的JVM上由来已久的bug,从8.1.4就开始在不同版本间出现。原因是IBM的JVM底层socket实现和weblogic配合问题,需要打patch CR370915_1030GA.jar解决。
操作过程
1.在weblogic的启动脚本中,找到CLASSPATH一行
2.在CLASSPATH变量的第一位添加补丁jar包
Eg: CLASSPATH=”${CLASSPATH}${CLASSPATHSEP}${MEDREC_WEBLOGIC_CLASSPATH}”
—>
CLASSPATH=/路径/CR370915_1030GA.jar:”${CLASSPATH}${CLASSPATHSEP}${MEDREC_WEBLOGIC_CLASSPATH}”
3.以上操作仅对这个domain起作用,为了对所有domain起作用,可以添加到common/bin/的目录中的commEnv.sh文件中WEBLOGIC_CLASSPATH=最前面
总结
这个bug在weblgoic和IBM的JVM相组合的平台上出现较为普遍,如果出现相关日志信息,基本可以断定需要打CR370915补丁。
更新:我这里的补丁仅仅 for weblogic 10.3.0.0,其它版本的可以自行用Smart Update下载
Patches for WLS 8.x can be found in My Oracle Support. Open the Patches & Updates tab. Search for patch ID 8173442 for the patches for WLS 8.1mp3, 8.1mp4, and 8.1mp5. Search for patch ID 8179792 for the patch for WLS 8.1mp6.
Patches for WLS 9.x and higher can be downloaded from Smart Update using these patch IDs and passcodes:
——————————————
PATCH REPOSITORY INFORMATION
——————————————
WLS Version | Patch ID | Passcode
————+———-+—————-
9.2 | T4DV | 7C7PYV9B
9.2mp1 | HZHQ | PTUYCCSI
9.2mp2 | WJD2 | GU1CW2AB
9.2mp3 | GNLT | 8J9L6Q4Y
10.0 | PMAJ | 9UQ69LLT
10.0mp1 | ITVL | K8RBHQQ2
10.3 | 9YT5 | I1DB5QSV
如果生产机无法联网,可以
- Download the patch using SmartUpdate on another machine with Internet access.
- Copy the files (for example E5W8.jar and WGQJ.jar) and patch-catalog.xml from your machine with Internet access to the offline machine. For example, say you have a test environment running on a Windows box. Your production environment is running on UNIX. You might copy the jar files from %BEA_HOME%/utils/bsu/cache-dir to $BEA_HOME/utils/bsu/cache-dir.
- When a machine connects to Smart Update, the catalog of patches is always updated automatically. Thus, when a patch is being copied to an offline machine, the patch-catalog.xml file must also be copied over.
- Run SmartUpdate in offline mode and apply patches and patch sets. This can be done using the SmartUpdate command-line interface (see http://download.oracle.com/docs/cd/E14759_01/doc.32/e14143/commands.htm#i1074489).
- This is the syntax for the command to install a patch:.
- You can apply the patch to the offline system manually by extracting the actual patch and adding it to the classpath on the offline system:Extract the actual patch jar file. If you downloaded the patch using SmartUpdate, it will be in the form <patch_id>.jar (for example: E5W8.jar). Inside this jar file is the actual patch jar file, which will be of the form CR326566_92mp3.jar. Extract the latter file for the following steps.
- Add the extracted jar file as the first element of the classpath of the Admin server as well as the managed servers in the domain.
- If you are starting servers using the WebLogic startup script, update the classpath in the startup script like this:set CLASSPATH=<PATCH_DIR>/jars/CR326566_92mp3.jar;%CLASSPATH% (Windows)CLASSPATH=<PATCH_DIR>/jars/CR326566_92mp3.jar:$CLASSPATH (UNIX)where PATCH_DIR is the directory on your local machine where you extracted/saved the patch file.
- Similarly, if you are starting servers using Node Manager, add the patch jar to the beginning of the Class Path argument in the Server Start tab for the server(s).
我一般用第二种,对于单个补丁快捷方便,SmartUpdate可以单独安装,但是会让你选择应用到哪个BEA的主目录,不同的版本和平台能下的补丁不一样。在Windows平台上当然没有AIX的BEA版本,不过只要自己建个目录,然后拷贝一份register.xml进去就可以了。
原文地址:http://blog.csdn.net/davidhsing/article/details/5854600