TLS SIP transport generating SSL errors
restamp:
This is a long-shot, but given that it involves an OBi202 I will ask the question here. My 202 (5757EX firmware) is configured with two SIP connections to the same Asterisk server. Both 202 and A* are on the same LAN. TLS is used as the SIP transport for both connections, one on SP1 and the other on SP3. As far as I can discern, both are configured identically. Recently I noticed that one connection (SP1) was taking occasional SSL errors:
Code:
[2017-12-05 09:13:32] WARNING[1594] pjproject: SSL SSL_ERROR_SSL (Read): Level: 0 err: <336151548> <SSL routines-SSL3_READ_BYTES-sslv3 alert bad record mac> len: 32000
[2017-12-05 09:13:32] VERBOSE[1600] res_pjsip/pjsip_configuration.c: Contact 805/sips:805@192.168.1.202:4357;transport=TLS has been deleted
[2017-12-05 09:13:32] VERBOSE[1594] res_pjsip_registrar.c: Removed contact 'sips:805@192.168.1.202:4357;transport=TLS' from AOR '805' due to transport shutdown
[2017-12-05 09:13:32] VERBOSE[1600] res_pjsip/pjsip_configuration.c: Endpoint 805 is now Unreachable
[2017-12-05 09:14:02] VERBOSE[1600] res_pjsip/pjsip_configuration.c: Contact 805/sips:805@192.168.1.202:4107;transport=TLS has been created
[2017-12-05 09:14:02] VERBOSE[11518] res_pjsip_registrar.c: Added contact 'sips:805@192.168.1.202:4107;transport=TLS' to AOR '805' with expiration of 60 seconds
[2017-12-05 09:14:02] VERBOSE[1600] res_pjsip/pjsip_configuration.c: Endpoint 805 is now Reachable
[2017-12-05 09:14:02] VERBOSE[1600] res_pjsip/pjsip_configuration.c: Contact 805/sips:805@192.168.1.202:4107;transport=TLS is now Reachable. RTT: 30.925 msec
These events seem random and average perhaps a dozen per day. Except for the 30 seconds it takes for the OBi to re-establish the transport, the line remains usable. Googling this error points to either data corruption or a bug in the SSL implementation at one end or the other as the cause.
But here's the kicker: The *other* SIP connection between the same 202 and A* server (on SP3) is rock solid and never glitches.
Any ideas?
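For anyone who wants to measure the reconnect gap rather than eyeball it, here is a minimal sketch that pairs each "Removed contact" line with the next "Added contact" line for the same AOR and reports the seconds between them. It assumes the log format matches the lines quoted above; the function name `outage_durations` and the regex are illustrative, not part of Asterisk.

```python
import re
from datetime import datetime

# Matches the timestamp, the action, and the AOR in registrar lines
# of the form quoted in the post above (format is an assumption).
LINE_RE = re.compile(
    r"^\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\].*?"
    r"(?P<action>Removed|Added) contact '[^']+' (?:from|to) AOR '(?P<aor>[^']+)'"
)

def outage_durations(lines):
    """Return a list of (aor, seconds) for each Removed -> Added pair."""
    removed_at = {}   # AOR -> time its contact was removed
    outages = []
    for line in lines:
        m = LINE_RE.search(line)
        if not m:
            continue
        ts = datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S")
        aor = m.group("aor")
        if m.group("action") == "Removed":
            removed_at[aor] = ts
        elif aor in removed_at:
            # Contact came back: record how long the AOR had no contact.
            outages.append((aor, (ts - removed_at.pop(aor)).total_seconds()))
    return outages

sample = [
    "[2017-12-05 09:13:32] VERBOSE[1594] res_pjsip_registrar.c: Removed contact 'sips:805@192.168.1.202:4357;transport=TLS' from AOR '805' due to transport shutdown",
    "[2017-12-05 09:14:02] VERBOSE[11518] res_pjsip_registrar.c: Added contact 'sips:805@192.168.1.202:4107;transport=TLS' to AOR '805' with expiration of 60 seconds",
]
print(outage_durations(sample))
```

Run against the two registrar lines from the excerpt, this reports a 30-second gap for AOR 805, which matches the timestamps in the log.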
ProfTech:
Just two cents worth. Does the 200/202 have the same default setting of 10/half ethernet as the 100? In the 100 you have to hard set it to 100/full. Also, long shot but I had a voice mail system once that was taking similar errors and it was a crappy hand-made ethernet cable that was causing it.
*edited* I think someone reported that 5757 was recalled with no explanation. Disregard
restamp:
Thanks for the suggestions. I changed the cables, but it didn't change anything. The original ethernet cables are good quality cables, or were for their day, although they probably pre-date the Cat 5e standard. (But, then, if the 202 is only running at 10Mbit that should be adequate, right?) The 202 is hanging off of a Gbit switch, but the run to the A* server is 100Mbit, so there is a bit of speed shifting going on. However, I don't think that is the source of the problem. Bottom line: I can always resolve the problem by backing off of using TLS -- it is really unnecessary in this environment -- but I'd like to understand what's going on. I only wish I had noticed this before I upgraded the Asterisk package; if it weren't such a hectic time of year, I might roll back to the older Asterisk just to see if the problem persists or goes away.
Addendum: After re-reading my own post, it dawned on me that I still had some old A* logs from before the upgrade, and a simple grep reveals no instances of "SSL_ERROR" in the logs prior to the upgrade. Thus, I must conclude that either A* 13.18.2 has a problem or I somehow didn't build it right. Still it's odd that of the three SIP connections on this 202 only the one appears to have been affected.
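The grep check described above can also be scripted to give a per-day count, which makes a before/after comparison around the upgrade date easier to read. This is only a sketch: it assumes every log line starts with a `[YYYY-MM-DD HH:MM:SS]` timestamp as in the excerpt, and the function name `ssl_errors_per_day` is made up for illustration.

```python
import re
from collections import Counter

# Leading "[YYYY-MM-DD " timestamp, as seen in the log excerpt (assumed format).
DATE_RE = re.compile(r"^\[(\d{4}-\d{2}-\d{2}) ")

def ssl_errors_per_day(lines, needle="SSL_ERROR"):
    """Count lines containing `needle`, grouped by the date in the timestamp."""
    counts = Counter()
    for line in lines:
        if needle in line:
            m = DATE_RE.match(line)
            if m:
                counts[m.group(1)] += 1
    return dict(counts)

sample = [
    "[2017-12-05 09:13:32] WARNING[1594] pjproject: SSL SSL_ERROR_SSL (Read): Level: 0 err: <336151548> <SSL routines-SSL3_READ_BYTES-sslv3 alert bad record mac> len: 32000",
    "[2017-12-05 09:13:32] VERBOSE[1600] res_pjsip/pjsip_configuration.c: Endpoint 805 is now Unreachable",
]
print(ssl_errors_per_day(sample))
```

To run it on a real log, pass an open file object instead of `sample`; the path depends on the local Asterisk logger configuration, so it is left out here.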
restamp:
Methinks I found the problem. From the release notes for Asterisk 13.18.3, the point release following the one I am running:
Code:
2017-07-27 06:35 +0000 [0d58fefa30] George Joseph <gjoseph@digium.com>
* bundled_pjproject: Improve SSL/TLS error handling
OpenSSL has 2 levels or error processing. It's possible for the
top layer to return SSL_ERROR_SYSCALL but the lower layer return
no error, in which case processing should continue. Only the top
layer was being examined though so connections were being torn
down when they didn't need to be. This patch adds the examination
of the lower level codes, and if they return no errors, allows
processing to continue.
ASTERISK-27001
Reported-by: Ian Gilmour
patches:
pjproject-2.6.patch submitted by Ian Gilmour (license 6889)
Updated-by: George Joseph and Sauw Ming (Teluu)
Merged to upstream pjproject on 7/27/2017 (commit 5631)
Change-Id: I23844ca0c68ef1ee550f14d46f6dae57d33b7bd2
Now to find the time to do yet another upgrade...
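The decision the release note describes can be modeled roughly as follows. This is a toy illustration in Python, not the actual pjproject patch: `should_tear_down` is a hypothetical name, the "error queue" is just a list standing in for OpenSSL's lower-level error codes, and treating every other non-zero top-level code as fatal is my assumption. The note only covers the SSL_ERROR_SYSCALL case.

```python
# Numeric values as defined in OpenSSL's ssl.h.
SSL_ERROR_NONE = 0
SSL_ERROR_SSL = 1
SSL_ERROR_SYSCALL = 5

def should_tear_down(top_level_error, error_queue):
    """Decide whether a failed read should close the transport.

    top_level_error: the top-layer result code.
    error_queue: pending lower-layer error codes; an empty list means
                 the lower layer reported no error.
    """
    if top_level_error == SSL_ERROR_NONE:
        return False
    if top_level_error == SSL_ERROR_SYSCALL and not error_queue:
        # Case from the release note: top layer reports an error, lower
        # layer reports none, so processing should continue.
        return False
    # Assumption: anything else is still treated as fatal.
    return True

print(should_tear_down(SSL_ERROR_SYSCALL, []))         # top layer only
print(should_tear_down(SSL_ERROR_SSL, [336151548]))    # code from the log above
```

Before the patch, per the note, only the top layer was examined, so the first case would also have closed the connection.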
ProfTech:
Good stuff. I recently upgraded to 15.1.2 to be sure I had the RTP bug fix. Can you give any info on what it takes to enable TLS in A* with pjsip? I've been kicking around trying to enable it but didn't know where to start.