# TiVo Frozen - any suggestions?



## simon (Oct 7, 2002)

SWMBO was watching EastEnders when the screen just froze (as if you had pressed pause).

I was able to telnet into TiVo, but I couldn't see anything strange, except that all the logs in /var/hack stopped at 19:12

I couldn't access TiVoWeb, and pressing buttons on the remote didn't change the colour of the front panel LEDs.

I rebooted the TiVo and it came up fine, but EastEnders had actually stopped recording at 19:12. Any ideas as to what could have happened?

Any suggestions as to what to do if it happens again?

Thanks

Simon


----------



## Ian_m (Jan 9, 2001)

Look in the kernel log (and okernel by now) for DMA. When I had a misbehaving disk I would get freezes, with DMA errors recorded in the kernel log.
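
A minimal Python sketch for doing that search on a PC, after copying the logs off the TiVo. It assumes the logs are plain text files (somewhere like /var/log/kernel and /var/log/Okernel), and matching any line that contains "dma" is only a rough filter, not a definitive list of error messages.

```python
import sys


def find_dma_lines(log_text):
    """Return every log line that mentions DMA, ignoring case."""
    return [line for line in log_text.splitlines() if "dma" in line.lower()]


if __name__ == "__main__":
    # Usage (script name is hypothetical): python find_dma.py kernel Okernel
    for path in sys.argv[1:]:
        with open(path, errors="replace") as handle:
            for line in find_dma_lines(handle.read()):
                print(f"{path}: {line}")
```

No output means no line mentioned DMA in the files you passed in; it does not rule out a disk problem on its own.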


----------



## simon (Oct 7, 2002)

Ian_m said:


> Look in the kernel log (and okernel by now) for DMA. When I had a misbehaving disk I would get freezes, with DMA errors recorded in the kernel log.


 Ok, I'll check that out


----------



## simon (Oct 7, 2002)

I only noticed entries in the kernel log and in the tvlog around the time of the freeze.

Kernel log:


```
May 15 19:00:15 (none) kernel: tcp_keepalive: call keepopen(0x80392300) 
May 15 19:00:58 (none) kernel: Key does not exist. 
May 15 19:00:58 (none) kernel: mediaswitch: returning 0 from standin tune after tuning to ch -3 with adjust 1 
May 15 19:00:58 (none) kernel: Done with this packet 
May 15 19:01:30 (none) kernel: tcp_keepalive: call keepopen(0x80392300) 
May 15 19:02:45 (none) kernel: tcp_keepalive: call keepopen(0x80392300) 
May 15 19:04:00 (none) kernel: tcp_keepalive: call keepopen(0x80392300) 
May 15 19:05:15 (none) kernel: tcp_keepalive: call keepopen(0x80392300) 
May 15 19:06:30 (none) kernel: tcp_keepalive: call keepopen(0x80392300) 
May 15 19:07:45 (none) kernel: tcp_keepalive: call keepopen(0x80392300) 
May 15 19:09:00 (none) kernel: tcp_keepalive: call keepopen(0x80392300) 
May 15 19:10:15 (none) kernel: tcp_keepalive: call keepopen(0x80392300)
```
tvlog: 

```
May 15 19:08:12 (none) Scheduler[161]: Mempool highwater 95492
May 15 19:08:12 (none) Scheduler[161]: Abr-- DataChanged:0x00090009
May 15 19:08:43 (none) Scheduler[161]: Mempool highwater 95492
May 15 19:08:43 (none) Scheduler[161]: Abr-- DataChanged:0x00090009
May 15 19:09:02 (none) Recorder[159]: Adding check schedule task
May 15 19:09:15 (none) Scheduler[161]: Mempool highwater 95492
May 15 19:09:15 (none) Scheduler[161]: Abr-- DataChanged:0x00090009
May 15 19:09:47 (none) Scheduler[161]: Mempool highwater 95492
May 15 19:09:47 (none) Scheduler[161]: Abr-- DataChanged:0x00090009
May 15 19:10:18 (none) Scheduler[161]: Mempool highwater 95492
May 15 19:10:18 (none) Scheduler[161]: Abr-- reason changed from 'UI' to 'None'
May 15 19:10:18 (none) Scheduler[161]: DISK SPACE: Total: 106274 Live cache: 1930 Overhead: 378
May 15 19:10:18 (none) Scheduler[161]: TIVO CLIPS DISK SPACE: Total: 9765 Overhead: 64
```


----------



## Mike B (Sep 16, 2003)

Did TiVo continue recording after a reboot, or did the recording not restart?


----------



## Ian_m (Jan 9, 2001)

I've got similar entries in my logs, so nothing wrong with yours.

Is your TiVo bigger than 137GB per disk? If so, did you install the LBA48 kernel?


----------



## simon (Oct 7, 2002)

Mike B said:


> Did TiVo continue recording after a reboot, or did the recording not restart?


 Yes, the recording started immediately upon boot again


----------



## simon (Oct 7, 2002)

Ian_m said:


> I've got similar entries in my logs, so nothing wrong with yours.
> 
> Is your TiVo bigger than 137GB per disk? If so, did you install the LBA48 kernel?


I did install the LBA48 kernel, although the disk is just 120GB.

I also have a cachecard, but I just have 128MB of RAM in it at the minute - I don't know if that would make any difference.

My wife tells me it has just started playing up again 

Simon


----------



## Ian_m (Jan 9, 2001)

simon said:


> I did install the LBA48 kernel, although the disk is just 120GB.


120GB sounds very small for a modern disk. Is it getting old?

From Telnet, type:

```
/var/hack/bin/smartctl -A /dev/hda
```
If Reallocated_Sector_Ct is anything other than zero and/or Power_On_Hours is large (say 30000 hours or above, about 3 1/2 years), you might want to consider getting your credit card out now and using it before the disk fails. You might also want to replace the PSU when changing the disk. Various people will sell you a pre-configured disk and PSU, which is the easiest repair option.
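
A rough way to apply that rule of thumb automatically, as a Python sketch to run on a PC against saved `smartctl -A` output. The function names and the 30000-hour default are only illustrative, and the parser assumes the nine-column attribute table that smartctl prints, reading the last column as RAW_VALUE.

```python
def parse_raw_values(smart_text):
    """Map ATTRIBUTE_NAME to its integer RAW_VALUE from `smartctl -A` output."""
    raw = {}
    for line in smart_text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and end with the raw value.
        if len(fields) >= 9 and fields[0].isdigit() and fields[-1].isdigit():
            # Rows that share a name (e.g. Unknown_Attribute) overwrite each other.
            raw[fields[1]] = int(fields[-1])
    return raw


def disk_warnings(raw, max_hours=30000):
    """Apply the rule of thumb: any reallocated sectors, or a lot of hours."""
    notes = []
    reallocated = raw.get("Reallocated_Sector_Ct", 0)
    hours = raw.get("Power_On_Hours", 0)
    if reallocated != 0:
        notes.append(f"Reallocated_Sector_Ct is {reallocated}, not zero")
    if hours >= max_hours:
        notes.append(f"Power_On_Hours is {hours}, at or above {max_hours}")
    return notes
```

Save the output with something like `/var/hack/bin/smartctl -A /dev/hda > smart.txt`, copy the file over, then call `disk_warnings(parse_raw_values(open("smart.txt").read()))`. An empty list only means neither rule fired; it does not prove the disk is healthy.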


----------



## simon (Oct 7, 2002)

Ian_m said:


> From Telnet type
> 
> ```
> /var/hack/bin/smartctl -A /dev/hda
> ```


Ok, I've gone and downloaded smartctl.

Power_On_Hours is currently 17620, so hopefully that is not too bad.

It is possibly the disk, that would be a nice problem, as I would like a quieter disk anyway...


----------



## djb2002 (May 1, 2006)

When I try that I just get:

ERROR: device does not support Self-Test function


----------



## Ian_m (Jan 9, 2001)

djb2002 said:


> ERROR: device does not support Self-Test function


Either a typo or you need to download the correct version of smartctl.

Simon, what was your Reallocated_Sector_Ct?


----------



## simon (Oct 7, 2002)

Ian_m said:


> Simon, what was your Reallocated_Sector_Ct?


Reallocated_Sector_Ct is 175 - what does that mean?

In case of any further enquiries, I have put the complete output below

Simon


```
smartctl version 5.1-9 Copyright (C) 2002-3 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE     WHEN_FAILED RAW_VALUE
  3 Spin_Up_Time            0x0027   201   200   063    Old_age      -       17195
  4 Start_Stop_Count        0x0032   253   253   000    Old_age      -       730
  5 Reallocated_Sector_Ct   0x0033   236   236   063    Old_age      -       175
  6 Read_Channel_Margin     0x0001   253   253   100    Old_age      -       0
  7 Seek_Error_Rate         0x000a   253   252   000    Old_age      -       0
  8 Seek_Time_Performance   0x0027   253   244   187    Old_age      -       53095
  9 Power_On_Hours          0x0032   227   227   000    Old_age      -       18722
 10 Spin_Retry_Count        0x002b   253   252   157    Old_age      -       0
 11 Calibration_Retry_Count 0x002b   253   252   223    Old_age      -       0
 12 Power_Cycle_Count       0x0032   252   252   000    Old_age      -       777
192 Power-Off_Retract_Count 0x0032   253   253   000    Old_age      -       0
193 Load_Cycle_Count        0x0032   253   253   000    Old_age      -       0
194 Temperature_Celsius     0x0032   253   253   000    Old_age      -       47
195 Hardware_ECC_Recovered  0x000a   253   252   000    Old_age      -       1559
196 Reallocated_Event_Count 0x0008   253   253   000    Old_age      -       0
197 Current_Pending_Sector  0x0008   253   253   000    Old_age      -       0
198 Offline_Uncorrectable   0x0008   253   253   000    Old_age      -       0
199 UDMA_CRC_Error_Count    0x0008   199   182   000    Old_age      -       28
200 Multi_Zone_Error_Rate   0x000a   253   252   000    Old_age      -       0
201 Unknown_Attribute       0x000a   253   244   000    Old_age      -       24
202 Unknown_Attribute       0x000a   253   252   000    Old_age      -       0
203 Unknown_Attribute       0x000b   253   252   180    Old_age      -       0
204 Unknown_Attribute       0x000a   253   252   000    Old_age      -       0
205 Unknown_Attribute       0x000a   253   252   000    Old_age      -       0
207 Unknown_Attribute       0x002a   253   252   000    Old_age      -       0
208 Unknown_Attribute       0x002a   253   252   000    Old_age      -       0
209 Unknown_Attribute       0x0024   196   195   000    Old_age      -       0
 99 Unknown_Attribute       0x0004   253   253   000    Old_age      -       0
100 Unknown_Attribute       0x0004   253   253   000    Old_age      -       0
101 Unknown_Attribute       0x0004   253   253   000    Old_age      -       0
```


----------



## Ian_m (Jan 9, 2001)

simon said:


> Reallocated_Sector_Ct is 175 - what does that mean?


It means get a new disk now!!! If you wait you may/will get a catastrophic disk failure and lose all recorded programmes.

What is happening is that your disk is failing and losing sectors, and the drive is automatically marking them bad and "replacing" them with spares elsewhere on the disk. This on-the-fly substitution can cause picture breakup on TiVo (with corresponding log entries) and sometimes a freeze that requires a reboot. You may find that playing the same programme that causes the "freeze" also increases the reallocated count, implying there is something wrong with a certain area of the disk. The drive only has a limited number of sectors available as replacements before it starts losing data.
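
To test that idea, you could save the `smartctl -A` output before and after replaying the suspect programme and compare the raw counts. A minimal Python sketch for a PC, assuming the usual attribute table layout with RAW_VALUE in the last column (the function names are made up for illustration):

```python
def reallocated_count(smart_text):
    """Return the raw Reallocated_Sector_Ct from `smartctl -A` output, or None."""
    for line in smart_text.splitlines():
        fields = line.split()
        if len(fields) >= 9 and fields[1] == "Reallocated_Sector_Ct":
            return int(fields[-1])
    return None


def reallocated_growth(before_text, after_text):
    """Return how many sectors were reallocated between two snapshots."""
    before = reallocated_count(before_text)
    after = reallocated_count(after_text)
    if before is None or after is None:
        raise ValueError("Reallocated_Sector_Ct row not found in one snapshot")
    return after - before
```

A result above zero after replaying the same recording would support the "bad area of the disk" theory; zero proves nothing either way.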

We have embedded systems that monitor this number and if not zero tag the system as failed. In my experience generally this number always increases till suddenly the disk becomes unusable.

I bet if you run the manufacturer's diagnostics on the drive in a PC, it will inform you of impending drive failure.

I might waffle on about this, but I ignored the above after getting 2 reallocated sectors a month or two before going away. TiVo crashed on day 1 of a two-week holiday; after a reboot when I got back, the drive had 14 reallocated sectors. I ignored that too, and one afternoon a month or two later it all went haywire, hanging/freezing and finally hanging at the "nearly there" screen as the A drive completely failed. I purchased a pre-configured 200GB drive and 2 days later TiVo was back up and running. Then I added another 300GB to feed my Mode 0 addiction..... Anyway, the new Seagates are silent compared to the old screeching 80GB Maxtors!!!!


----------



## Ian_m (Jan 9, 2001)

One other thing: your power cycle count is rather high, 777 for 17,000 hours. My dying disk was only 69 in 33,000 hours.

My current disks are at 25 cycles in 6000 hours, but then I have been (and still am) having a new kitchen fitted and power during the day has been intermittent (+ I keep flicking the wrong breaker !!!).
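
To put those figures on the same footing, divide hours by cycles. A quick Python sketch using only the numbers quoted in this post (the labels are just for illustration):

```python
def hours_per_cycle(power_on_hours, power_cycles):
    """Average running time between power cycles."""
    return power_on_hours / power_cycles


disks = [
    ("simon's disk", 17000, 777),
    ("dying disk", 33000, 69),
    ("current disks", 6000, 25),
]
for label, hours, cycles in disks:
    print(f"{label}: about {hours_per_cycle(hours, cycles):.0f} hours per power cycle")
# simon's disk: about 22 hours per power cycle
# dying disk: about 478 hours per power cycle
# current disks: about 240 hours per power cycle
```

That works out to roughly one power cycle per day of running for the first disk, against one every 10 to 20 days for the other two.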


----------



## simon (Oct 7, 2002)

Ian_m said:


> One other thing: your power cycle count is rather high, 777 for 17,000 hours. My dying disk was only 69 in 33,000 hours.


That disk was in my PC in the days when I switched it off, so that would probably account for the high number of cycles.

Thanks to all for the advice!

Simon


----------



## Ian_m (Jan 9, 2001)

If you can get it on your PC with Windows running (yes, I know it will overwrite something on the disk and stop TiVo working), you can then perform the long SMART self-test. You can't do this whilst the disk is in TiVo, as the test gets cancelled because TiVo continually accesses the disk.

From the command line, type (there are CMD32 versions of smartctl):

```
smartctl -t long /dev/hda
```
Wait a while (as indicated by the first command), then type:

```
smartctl -l selftest /dev/hda
```
to show the logs. This includes a lot more than the TiVo version of smartctl shows, such as the number of hours before the first reallocated sector, the number of hours for the last couple of reallocated sectors, etc.


----------



## djb2002 (May 1, 2006)

Ian_m said:


> Either a typo or you need to download the correct version of smartctl.
> 
> Simon, what was your Reallocated_Sector_Ct?


Definitely typed it correctly.

Do you know how to update it (without causing problems with anything else) ??

Thanks
Daniel


----------



## Ian_m (Jan 9, 2001)

djb2002 said:


> Definitely typed it correctly.
> Do you know how to update it (without causing problems with anything else) ??


Maybe your drive doesn't support SMART, which would be very strange, as even drives from the late 90s have SMART built in.

Try:

```
/var/hack/bin/smartctl -i /dev/hda
```
You should get something like this:

```
smartctl version 5.1-9 Copyright (C) 2002-3 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Device Model:     ST3200822A
Serial Number:    4LJ0V3LN
Firmware Version: 3.01
ATA Version is:   6
ATA Standard is:  ATA/ATAPI-6 T13 1410D revision 2
Local Time is:    Thu May 18 07:59:34 2006 localtime
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
```


----------



## djb2002 (May 1, 2006)

Thanks for your reply.

The only location I can find smartctl is in /var/bin.

When I do that command I just get:


```
Device: IC35L120AVV207-0  Supports ATA Version 6
Drive supports S.M.A.R.T. and is enabled
```
Thanks
Daniel


----------



## Ian_m (Jan 9, 2001)

You need to find the later version of smartctl for TiVo then.


----------



## djb2002 (May 1, 2006)

Ian_m said:


> You need to find the later version of smartctl for TiVo then.


Hi Ian,

Thanks for your reply. - I'll give that a go 

Daniel


----------



## djb2002 (May 1, 2006)

I've just uploaded it, and tried again, but just get the same.

Is there an easy way to see if it is picking up an old copy from another location ??

Thanks in advance,
Daniel


----------



## djb2002 (May 1, 2006)

OK, I found another copy of smartctl and upgraded to the version on this thread.

When I try and run it I get the following error:

BUG IN DYNAMIC LINKER ld.so: dynamic-link.h: 46: elf_get_dynamic_info: Assertion
`! "bad dynamic tag"' failed!

I have tried downloading again and reuploading, but still get the same ??

Any ideas ?

Thanks again
Daniel


----------



## Restorer (Jan 6, 2002)

Ian_m said:


> Either a typo or you need to download the correct version of smartctl.


Where to get this please and how/where to install it?


----------



## djb2002 (May 1, 2006)

Ian_m said:


> And if Reallocated_Sector_Ct is anything other than zero and/or Power_On_Hours is large (say 30000 hours or above, 3 1/2 years) you might want to consider getting your credit card out now and using it before the disk fails.


I've managed to get the smartctl working on my Tivo.

The 'Reallocated_Sector_Ct' is showing as 1 on both my hda and hdb drives.

Does anyone know exactly what this means? Are these bad sectors?

Thanks
Daniel


----------



## Ian_m (Jan 9, 2001)

djb2002 said:


> ...The 'Reallocated_Sector_Ct' is showing as 1 on both my hda and hdb drives.
> Does anyone know exactly what this means? Are these bad sectors?


In my experience anything other than zero is bad news.....

This means the drive has detected a duff sector and assigned a new one to replace it (sometimes on same drive "cylinder" or sometimes at edge of disk). As I said before we have embedded PC systems that count anything other than zero as a failure and require a drive replacement, preferably sooner rather than later.

One of my TiVo drives had 'Reallocated_Sector_Ct' of 2 for ages (9 months) before, as would be expected, it suddenly got larger, TiVo rebooted/froze a couple of times then died as the disk expired.

It's a shame the TiVo version of smartctl is not complete, as it cannot retrieve the full logs (smartctl -l selftest /dev/hda). With those you could ascertain how old the drive was when the error occurred and base your disk replacement timescale on whether it is a recent or an old error.

Are your drives mounted on the bracket? The reason I ask is that one of our customers had an excessively high drive failure rate, and this was traced to bolting the drive straight to the system chassis. Mount the drive on a carrier plate/cradle bolted to the chassis and suddenly, no failures.


----------



## barbrook2 (Jun 7, 2006)

The title to this thread sums it up. My TiVo is freezing every day. It doesn't respond to the remote and there is no tivoweb or telnet.

I've already put in a new power supply to no avail and have downloaded and run smartctl. 'Reallocated_Sector_Ct' is showing 0, so I guess the disk is OK. It's a 250GB Samsung and about 18 months old.

Does anyone have any suggestions as to what I could try next?

Thanks


----------



## blindlemon (May 12, 2002)

barbrook2 said:


> ...'Reallocated_Sector_Ct' is showing 0, so I guess the disk is OK.


Not necessarily.

The TiVo OS doesn't have the smarts to force the drive to reallocate bad sectors so your drive could be a mess!

I would pull the drive and check it with HUTIL. If it has bad sectors then it should be replaced. You can get a warranty replacement from www.rexo.co.uk in about a week as long as it is less than three years old.


----------



## barbrook2 (Jun 7, 2006)

Thanks for that. I'll have a look at hutil and see what it comes up with.

BTW the disk was supplied as an upgrade from tivoheaven


----------



## blindlemon (May 12, 2002)

barbrook2 said:


> BTW the disk was supplied as an upgrade from tivoheaven


If you would like me to handle the RMA and reconfigure it then just drop me an email or PM


----------



## anyoneinracks (Jan 20, 2003)

Would you like to try my tale of woe?
My Series 1 TiVo with a 120GB disk, TiVoWeb, Endpad and Tystudio has been running beautifully for many years. It has now started freezing, usually on a Monday evening. It says it is recording, but it does not respond to the remote in any way, and TiVoWeb does not work. On reboot it continues to record as it should have been. The last bit of the recorded programme does seem to have a hole in it. The logs don't show anything that looks odd to me:


> /var/log/Otverr/
> Nov 17 08:20:49 (none) TClient[421]: failed connect - aborting
> /var/log/Okernel/
> Nov 17 20:13:53 (none) kernel: tcp_keepalive: call keepopen(0x802fb060)
> ...


Everyone seems to blame the hard drive, so I tried smartctl. Again it all looked quite good:


> === START OF INFORMATION SECTION ===
> Device Model: SAMSUNG SV1203N
> Serial Number: S01CJ10Y221454
> Firmware Version: TQ100-30
> ...


Any other thoughts??


----------



## Logan (Mar 19, 2004)

My TiVo has started freezing as described above as well!
I deleted the last 10 hrs of programmes and that did nothing: it was still freezing after a few hours, then OK for a few hours after a reboot.
I then deleted ~50 hrs of programmes, and it hasn't frozen for a few days now.


----------



## ColinYounger (Aug 9, 2006)

How very odd. My main TiVo hung while recording Monday as well, as well as hanging yesterday. Logs show nothing exciting.

Perhaps there's a bit of dodgy data from the daily call filtering through?


----------

