Mystery reboots Obi 110 build 2774

Started by ProfTech, February 13, 2013, 07:39:54 AM

ProfTech

#20
I sent a ticket to Obi and they loaded 2776 into my 110. I have a 100 that I manage remotely via the portal and I noticed that the default is Auto Firmware update Off. I turned it On and of course the unit reboots whenever you make a change via the portal. I will check later and see if it updated.

*Edited* The unit did not update when it rebooted. But then I realized that the URL field appears to be blank by default. I think someone else reported that Auto Firmware update does not work on the 100/110. I think they said if set to "Periodically" it causes the unit to reboot every time it runs even though it never updates anything.

CoalMinerRetired

I have to vent on the 3.0.1 (Build: 3741) on my Obi202. 

It's exhibiting an increasingly annoying assortment of bugs that are (mostly) random and not repeatable:

- The time sync takes an hour or two to update after a reboot. This one is not random; it is repeatable. Thus I have multiple calls in history starting at 01/01/2010. I even have a few of these, "01/01/2010  19:08:05", which seems to indicate it is taking 7 hours after a reboot to get the correct time. Using the same time server in my router works without issue, as it did on earlier releases on the Obi.

- Out of nowhere today I got an SMS notification via Caller ID at 10:46 AM. Problem is I got that same SMS notification yesterday when the txt was first sent, and the txt message thread is archived in GV, and marked read.  Both SMS messages are in the Obi call history.

- For the last week I sometimes have to forcibly power down for the Obi to get a DHCP address, such as after a config change reboot fails to get an address.  No change in my router whatsoever.

- In the past five working days I have had three calls 'go dead' where the device stops responding in the middle of a call and I have to power cycle to reboot. Then hope the preceding acquire DHCP step works. Or else it takes 10 minutes to get back up and dialed back into a conf. call.

- I'm noticing numerous calls that 'drop' right after the dialed number connects to an auto attendant on the other end.  I call the same number a lot during working hours, and this never happened before. It's either the auto attendant (no one else dialing in to the AA sees it) or my Obi. Call History shows the Call Ended originates by the phone on the Obi.

- On a positive note, I'm not seeing mystery reboots. On a bad note, that is likely because I have had to manually reboot/power cycle once a day, on account of the above, for the last two weeks.


QBZappy

CoalMinerRetired,

I submit nomination for CMR as beta tester!
Owner of the 1st OBi110/100 units in service in Canada & South America. 1st OBi202 on my street. 1st OBi1032 in Montreal.

CoalMinerRetired

This is beyond an annoyance and out of hand right now. 

- Between 5:00 pm and 7:00 pm this evening I had another mystery reboot. Also had a received call which shows in the Call History with a timestamp of "01/02/2010    23:05:35", which contradicts the working theory that the time after reboot starts as 01/01/2010 12:00:00.  As of 7:00 pm the time is synced and correct again.  The timestamp for the caller id on my desk phone shows 01/02 11:05 pm.

- Forgot to add a prevalent but not repeatable issue to the above list: Can't hangup a call! When on some calls I can't 'hang up', even if I disconnect the phone cord to the handset.  When I do hangup the handset, I can pick up again 10 or 20 seconds later and the call is still connected. The phone port light(s) on unit are flashing clearly indicating the phone is in use. If the call is to a live person, I wait till they hangup, if I connect to some auto attendant, the only way to get out is a forced reboot.  On a few rare occasions this has manifested itself by the device simply hanging in the middle of a call, and again the only option is a power cycle reboot.

- Things like the above make you want to scream and give up on the whole thing!  Which this week I seriously considered doing. The only thing that stopped me was the ability to make calls directly from Gmail in a browser, and the Gmail doesn't have to have GV. 

I'll be curious if anyone else sees any of these issues, even one, on an Obi202 with "3.0.1 (Build: 3741)".
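
The timestamp arithmetic behind the "7 hours" estimate can be checked with a short sketch. It assumes the unsynced clock restarts at 01/01/2010 12:00:00 after a reboot (an assumption chosen to match the 2010 stamps in the call history, not something Obihai documents here); the function name is made up for illustration.

```python
from datetime import datetime

# Assumption: the unsynced clock starts here after a reboot.
ASSUMED_BOOT_CLOCK = datetime(2010, 1, 1, 12, 0, 0)

def implied_time_since_boot(call_history_stamp, boot_clock=ASSUMED_BOOT_CLOCK):
    """Return how long the unit had been up, if the stamp came from an unsynced clock."""
    # Call History entries can contain runs of spaces, so normalize them first.
    normalized = " ".join(call_history_stamp.split())
    stamp = datetime.strptime(normalized, "%m/%d/%Y %H:%M:%S")
    return stamp - boot_clock

print(implied_time_since_boot("01/01/2010  19:08:05"))    # 7:08:05
print(implied_time_since_boot("01/02/2010    23:05:35"))  # 1 day, 11:05:35
```

Under that assumption the first stamp fits the "about 7 hours without sync" reading, while the second would imply roughly 35 hours since boot, which does not fit a reboot earlier the same evening.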

Ostracus

Obviously, have you thought about regressing your firmware?

CoalMinerRetired

Quote from: Ostracus on March 07, 2013, 11:58:59 PM
Obviously, have you thought about regressing your firmware?
In the last few days, yes I have. 

However just last week I moved from configuring the device locally to doing so via the ObItalk portal, thinking that would somehow make things better, as in contacting support for assistance.  It was the portal that installed a newer firmware version (newer than is available via ***6).  So I have to decide which is a better option, contacting support or rolling back and doing local configuration.

ProfTech

Had another crash last night with code 94. This time with build 2776. Sent a message to Obi and they responded with "Everything looks Ok. Ignore it"

Shale

I would try an uninterruptible power supply if I were getting those reboots.

CoalMinerRetired

Quote from: Shale on March 13, 2013, 11:12:24 AM
I would try an uninterruptible power supply if I were getting those reboots.
I have to rule that out. I have ten+ other devices (router, switches, cable modem, Obi110, analog phone, laser printer, cordless phone etc.) connected to the same circuit that do not hiccup from power interruptions.

CoalMinerRetired

Quote from: ProfTech on March 13, 2013, 10:57:19 AM
Had another crash last night with code 94. This time with build 2776. Sent a message to Obi and they responded with "Everything looks Ok. Ignore it"
I also contacted Obi support. Their response was to enable ***27 100 Mbps full duplex (already done) and then to selectively disable SP configurations and determine if the issue goes away.  The divide and conquer approach.

ProfTech

Hmmm... That is mildly interesting. I have CallCentric on sp1 for 2 years with no problem until now. I have CallWithUs on sp2 but it is set to not register. And Obitalk is enabled. I am only currently testing CallWithUs so could easily disable it but don't have much faith that will change anything since it isn't even registering and I don't have DID with them. It would be nice if they told us what code 94 is actually supposed to mean.

CoalMinerRetired

I'm getting beyond mildly annoyed by this (the reboots and the suggestion to try the divide and conquer approach).

I was away all of last week, and detected one reboot, plus one instance where I couldn't determine what happened because the status via the ObiTalk portal was mixed up.  Then after five days of uninterrupted uptime (notably with zero calls placed or answered), and then one short call received this AM, I got a reboot an hour or two after the received call. So I guess I'll start selectively disabling two of the four SPs and see what happens.

CoalMinerRetired

#32
A 12-day update.

Using the divide and conquer approach, I have two (of four) SP configurations disabled. Both are for Callcentric (with the CNAM with GV arrangement), and the two GV SPs are still enabled.

Uptime has been 12 days so far, that exceeds any prior uptime duration (on this firmware release), so the obvious culprits are the two CallCentric SP services.

The device's behavior has not been 100% perfect these 12 days, though. Today I had an active call change into dead air. This is about the fifth or sixth time in as many days it happened.  I'll be on a call, and anywhere from 60 seconds to 15 minutes into the call the other party and I stop hearing each other.

There is no clicking on the line nor is there a hangup. In fact, I cannot disconnect the call: placing the handset down on the hook has no effect. The Obi ignores this signal and keeps the call in progress, no matter how many times I press the hook.  All the while I see the Phone 2 port light flashing, indicating "The phone is in use," and at the same time the Obi WebPage is unresponsive, so I cannot do a "Remove" under Status > Call Status to end the call.  As a side note and curiosity, I see that "Remove" records the call as "Call Ended" by both parties, in both columns, in Call History.

I chose not to reboot, and to wait it out and see how long the Obi took to become responsive again.  In today's case it was 60 to 90 seconds until the call hung up and I could place a call again. Curiously, the Call History showed the call was ended by my phone, not by the other end.

And to add to the symptoms, today when I placed a follow up call (back into the ongoing multi-party conference call), there was an enormous amount of echo on the entire call, whoever spoke had a bad echo.  After a few minutes I became suspicious I was causing it, I dropped off and dialed in again and the echo was gone. Other parties did the same so we're not sure who caused it, ... but I'm suspicious it was caused by my end.

So my next step is to re-enable the Callcentric SP services, will report back again.

ProfTech

It sounds like from yours and other posts that the 202 firmware has some issues. I can sympathize. I owned my 110 for about a year before the firmware was fairly solid and worked like it was supposed to, and mine was not one of the first units off the line. They were about 6 months into sales when I purchased. FWIW, Obi loaded 2776 into my unit. I disabled CallWithUs for the time being and my unit seems to be more stable now. Still using CallCentric on SP1. But I have made a few other minor tweaks along the way so hard to say what was actually going on. Stay tuned...

CoalMinerRetired

I reached the 14 day uptime milestone today. So I decided to carefully re-enable both Callcentric SPs.

Did so via the ObiTalk wizard for Callcentric.  After doing so, made a few minor (from my pov) expert mode changes: for both CC SPs unchecked skip call screening ('cause I'm doing the GV & CC with CNAM thing), and for one of the two SPs (SP2) had to change the Ring Profile to A, and default_ring to 7.  Strangely, SP2 defaulted to Codec Profile B and Ring Profile B, whereas SP 1, 3 and 4 default to profile A.

By my count I have the two suspect Callcentric SPs now setup with only four small tweaks that vary from the ObiTalk portal wizard. 

I hope the next update I add to this thread is to report success.  If not I'm going to seriously consider giving up.

ProfTech

I tried several ATA's before I ended up with the Obi 110. It is the only adapter at its price point that supports the PSTN. I tried an SPA 3102 which hasn't been updated since 2009 and it was junk as far as the PSTN was concerned, and apparently no longer supported. After several firmware revs I am pretty happy with the 110 overall. I hate to say it, but maybe the Cisco SPA122 might be a better choice than the 202. It is new but several s/w revs have gone through so maybe they are getting the bugs worked out as well. Except for the 202 having Google Voice capability I think they are pretty close to apples to apples.

CoalMinerRetired

One more update in this never ending series of frustrations.  I also want to post this before moving on to a new firmware in case the issues just disappear ... or god forbid get worse.

In summary, it's still not fixed, but evidence is pointing to the two Callcentric SPs interfering with each other, in some as yet unexplained manner.

The details: I formally reported the issue to Obi Customer Support.  Getting a direct answer, or even a response after multiple mails has been a challenge, which of course adds to the frustration. Support replied with the usual: try **0 Option 27 (100 Mbps full duplex), then connect the Obi directly to my Cable Modem (no can do for days at a time in my case) and then 'try a new router'. My response to this last was "I just spent $150 on a new SOHO router four weeks ago, do I really need to try a third one?"

After some back-and-forth messages, many of which went unanswered, and then some more prodding, they replied that I should change the registration periods for the two CC SPs to 600 and 615 seconds. I tried that, but it was in vain, I believe, because on the System Status page the two CC SPs would show expires in XX seconds, where XX was never more than 60.  They did not acknowledge this in an email response (to the original 'try 600 and 615' reply), but since I was at my wits' end I decided to try 59 seconds and 60 seconds, with the intent to change to 53 and 59 seconds (the greatest prime numbers less than 60) the next time I purposely reboot.

The change to 59 and 60 seconds is correctly reflected in the CC SPs on the main status page, and the conclusion here is there's a 60 second maximum registration period (for CC or all SIP providers? Nothing I could find in the admin manual). Even more encouraging, the change to 59 and 60 seconds seems to have 'reduced' the problem, but not completely eliminated it in my uptime test so far.
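
The thread does not say why support suggested unequal periods; presumably the idea is to keep the two Callcentric registrations from firing at the same moment. Under that reading, the sketch below checks the "greatest primes below 60" claim and estimates how often two periodic timers would line up. It assumes both timers start at the same instant and fire exactly on their configured period, which is a simplification and not documented Obi behavior. The function names are hypothetical.

```python
from math import gcd

def primes_below(n):
    """Return all primes smaller than n by trial division (fine for small n)."""
    return [p for p in range(2, n)
            if all(p % d for d in range(2, int(p ** 0.5) + 1))]

def seconds_until_timers_coincide(period_a, period_b):
    """Least common multiple: when two timers started together next fire together."""
    return period_a * period_b // gcd(period_a, period_b)

print(primes_below(60)[-2:])                    # [53, 59]
print(seconds_until_timers_coincide(600, 615))  # 24600
print(seconds_until_timers_coincide(59, 60))    # 3540
print(seconds_until_timers_coincide(53, 59))    # 3127
```

Any pair with no common factor coincides only once per product of the two periods, so 59/60 already behaves about as well as 53/59 in this simplified model.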

After six days of uptime, I had one episode with no dial tone and unresponsive via the 192.168.xxx.yyy Obi WebPage. After about a minute or two got dial tone back, and the 192.168.xxx.yyy started to respond. Unlike past episodes, this one did not result in a spontaneous mystery reboot, and uptime is now at 8 days, which is a new record. So those two points are progress (although small), in my experience.
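
Episodes like this one are easier to time if something is polling the unit's web page. Below is a minimal sketch that only checks whether a TCP connection to the device's web port succeeds and then collapses the samples into outage spans. The function names, the port 80 default, and the polling interval are all assumptions for illustration; it does not use any Obi-specific API, and the device address has to be supplied by the reader.

```python
import socket
import time
from datetime import datetime

def is_reachable(host, port=80, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def outage_spans(samples):
    """Collapse (timestamp, reachable) samples into (start, end) outage spans.

    An outage still open at the last sample is reported with end=None.
    """
    spans = []
    start = None
    for stamp, reachable in samples:
        if not reachable and start is None:
            start = stamp                 # first failed probe opens an outage
        elif reachable and start is not None:
            spans.append((start, stamp))  # first good probe closes it
            start = None
    if start is not None:
        spans.append((start, None))
    return spans

def poll(host, interval=10, count=6):
    """Probe the host `count` times, `interval` seconds apart."""
    samples = []
    for _ in range(count):
        samples.append((datetime.now(), is_reachable(host)))
        time.sleep(interval)
    return samples
```

Calling `outage_spans(poll(device_address, interval=10, count=360))` would cover about an hour, where `device_address` is the unit's LAN address. A probe like this only shows that the web page stopped answering; it cannot tell dead air from a reboot.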

In addition to the change to 53 and 59 seconds, (after I see how far I can go with the uptime using 59 and 60), I think I'll add two backup DNS servers, Google Public DNS, which I think will be used in priority order ahead of my ISP's (Comcast) dynamically assigned DNS servers, based on the way I read the Admin Manual.

ProfTech

FYI, the maximum re-register time for CallCentric is 120 seconds. The thing is, the Obi 110 re-registers when 1/2 of whatever time you specify has expired. So the 110 will register every 60 seconds if you have it set to 120. You can go as big as you want but anything 120 or larger will cause the unit to register every 60 seconds. I looked at this a bunch a year or so ago and ended up setting mine for 120 every time. Not sure if the 202 works the same or not. The Mediatrix adapter I looked at allowed you to specify how far in advance of expiration you wanted to re-register but I ended up sending it back because it kept dropping registration and wouldn't stay registered at all.
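
The rule described above can be written out as a small sketch: the provider caps the granted expiry at 120 seconds, and the unit re-registers at half of whatever was granted. The 120-second cap and the half-expiry behavior are taken from this post and apply to the 110; whether the 202 behaves the same is unknown. The names are made up for illustration.

```python
CALLCENTRIC_MAX_EXPIRES = 120  # seconds, as reported in this post

def effective_reregister_interval(configured_period,
                                  provider_max=CALLCENTRIC_MAX_EXPIRES):
    """Seconds between re-registrations for a given configured period."""
    granted = min(configured_period, provider_max)  # provider caps the expiry
    return granted / 2                              # unit re-registers at half

for period in (600, 615, 120, 60, 59):
    print(period, "->", effective_reregister_interval(period))
```

By this rule 600, 615 and 120 all end up re-registering every 60 seconds, while 60 and 59 give 30 and 29.5 seconds.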

CoalMinerRetired

Now that you mention it, I think you (or someone else) posted this 1/2 the max time thing somewhere on here before.

My update here is the reboots no longer happen (and I'm not exactly sure why, several recent firmware releases play into this) but I still get the precursor to the reboot: dead air and an unresponsive unit that will not 'hangup' or otherwise end a call for several minutes. Looking back at the timestamps on this thread, I've been fighting this since February, and at its height it was incredibly frustrating.

This past weekend I setup a new approach. I previously had one Obi202 with
SP1 = GV1, SP2 = CC1
SP3 = Gv2, SP4 = CC2,

The two CC are used for the CNAM and E911.

I have a spare Obi202 available for a few weeks (from a friends and family installation that is postponed for a while). Based on some things Obi support suggested to try (see above posts), I decided to try this arrangement with only one CC SIP on each Obi:
Obi202 #1: SP1 = GV1, SP2 = CC1
Obi202 #2: SP1 = GV2, SP2 = CC2.
SP3 and SP4 are not configured on either unit.

I had high hopes for this, but alas today I got another episode of dead air while on an important conference call. The call was outbound on GV2, so CC was not in use.  
I hope this was a one-off fluke, because it did not behave like the past episodes did. This one just became dead air, with no reboot, and unlike in the past I could hang up the call; however, for several minutes neither GV1 nor GV2 would accept an outbound call, just more dead air when anything was dialed.  In hindsight I should have tried 933, which is the only outbound call I can make on a CC SIP; all other calls are incoming only on CC.  

I am aiming for 10 days uninterrupted uptime as something of a huge milestone, but today's episode shattered that idea. This week is heavy on conf. calls so we'll see how the remaining seven or eight days go.

ProfTech

This is kind of an old thread but I thought I would post an update. After reading some posts about Callcentric issues I decided to try disabling ProxyServerRedundancy and SecondaryRegistration for Callcentric, while making sure X_DnsSrvAutoPrefix was checked. I'm happy to say the crashes have stopped. At this point I have been unable to detect any other differences between 2774 and 2776. I'm assuming the differences are minor since Obihai has never posted an update for 2776.