日期:2014-05-16  浏览次数:20439 次

IP冲突导致的ORA-12170
IP冲突导致的ORA-12170
开发反馈应用程序连不上数据库,并且应用日志中报错:
2014-03-24 09:56:03,095 [WARN] ==== http-bio-8010-exec-3: org.hibernate.util.JDBCExceptionReporter.logExceptions(77)
        SQL Error: 20, SQLState: 61000

2014-03-24 09:56:03,096 [ERROR] ==== http-bio-8010-exec-3: org.hibernate.util.JDBCExceptionReporter.logExceptions(78)
        The Network Adapter could not establish the connection
DB环境为RAC环境,登陆上两台服务器分别检查,发现RAC上的资源都显示为正常状态,包括instance,监听等,并且从服务器端sqlplus / as sysdba连进去坐了些switch logfile之类的操作,都没有问题。
于是尝试从我的个人电脑上Plsql连接DB,半天不响应,之后报了:ORA-12170,TNS:Connect timeout occurred。看看sqlnet.log的信息:
***********************************************************************
Fatal NI connect error 12170.

  VERSION INFORMATION:
 TNS for 32-bit Windows: Version 10.2.0.1.0 - Production
 Windows NT TCP/IP NT Protocol Adapter for 32-bit Windows: Version 10.2.0.1.0 - Production
  Time: 24-3月 -2014 09:43:08
  Tracing not turned on.
  Tns error struct:
    ns main err code: 12535
    TNS-12535: TNS: 操作超时
    ns secondary err code: 12560
    nt main err code: 505
    TNS-00505: 操作超时
    nt secondary err code: 60
    nt OS err code: 0
  Client address: <unknown>


***********************************************************************
各个错误代码如下:
[oracle@bssdb02 ~]$ oerr ora 12170
12170, 00000, "TNS:Connect timeout occurred"
// *Cause:  The server shut down because connection establishment or
// communication with a client failed to complete within the allotted time
// interval. This may be a result of network or system delays; or this may
// indicate that a malicious client is trying to cause a Denial of Service
// attack on the server.
// *Action: If the error occurred because of a slow network or system,
// reconfigure one or all of the parameters SQLNET.INBOUND_CONNECT_TIMEOUT,
// SQLNET.SEND_TIMEOUT, SQLNET.RECV_TIMEOUT in sqlnet.ora to larger values.
// If a malicious client is suspected, use the address in sqlnet.log to
// identify the source and restrict access. Note that logged addresses may
// not be reliable as they can be forged (e.g. in TCP/IP).
[oracle@bssdb02 ~]$ oerr ora 12535
12535, 00000, "TNS:operation timed out"
// *Cause: The requested operation could not be completed within the time out
// period.
// *Action: Look at the documentation on the secondary errors for possible
// remedy. See SQLNET.LOG to find secondary error if not provided explicitly.
// Turn on tracing to gather more information.
[oracle@bssdb02 ~]$ oerr ora 12560
12560, 00000, "TNS:protocol adapter error"
// *Cause: A generic protocol adapter error occurred.
// *Action: Check addresses used for proper protocol specification. Before
// reporting this error, look at the error stack and check for lower level
// transport errors. For further details, turn on tracing and reexecute the
// operation. Turn off tracing when the operation is complete.
[oracle@bssdb02 ~]$

很诡异,这种现象一般是服务名或者连接串配置有误,但同时多个用户突然出现这种错误,应该可以排除这方面的原因,防火墙都是关闭状态,这方面影响因素也可以排除,于是做了level 16的客户端的tns跟踪:
trace_level_client = 16
trace_file_client = client
trace_directory_client = d:\tmp

从trace内容来看,应该是在NODE1将用户连接转发到NODE2之后,用户连接在连接NODE2的时候出现问题导致的。
[24-3月 -2014 09:40:59:687] nsc2addr: (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=10.0.23.135)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=RAC)(FAILOVER_MODE=(TYPE=SESSION)(METHOD=BASIC))(CID=(PROGRAM=C:\oracle\product\10.2.0\db_1\bin\sqlplus.exe)(HOST=BPIT-10003657)(USER=pengmd))))
[24-3月 -2014 09:40:59:687] nttbnd2addr: entry
[24-3月 -2014 09:40:59:687] snl