# HMO text encoding for metadata



## wr0ngway (Jan 4, 2014)

Does anyone know what charset/encoding the XML response for HMO requests needs to be in? I'm running into an encoding display issue with my tivohmo Ruby project ( https://github.com/wr0ngway/tivohmo ).

The problem I'm seeing is that the metadata (specifically the description) that I fetch from Plex has some Unicode characters (specifically U+2019, the right single quotation mark, which is used as an apostrophe), and these show up as empty boxes when browsing the item in the TiVo HMO browse UI. However, once I transfer it to the TiVo, it does show up correctly as an apostrophe in the details for the show in the "My Shows" UI.
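To see which characters in a description are likely to render as boxes, it can help to list every non-ASCII character with its code point. This is only a debugging sketch: `non_ascii_code_points` is a hypothetical helper name, not part of tivohmo, and it assumes the input is a valid UTF-8 string.

```ruby
# Hypothetical debugging helper (not part of tivohmo): list each distinct
# non-ASCII character in a UTF-8 string as a U+XXXX code point.
def non_ascii_code_points(text)
  text.each_char
      .reject(&:ascii_only?)                  # keep only non-ASCII characters
      .uniq                                   # report each character once
      .map { |ch| format('U+%04X', ch.ord) }  # format as U+XXXX
end

description = "It\u2019s a caf\u00E9"
puts non_ascii_code_points(description).inspect
```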

I'm doing everything in UTF-8, and both the QueryContainer and TVBusQuery responses are in UTF-8. I've tried setting the Content-Type response header to 'text/xml', 'application/xml', 'text/xml; charset=utf-8', and 'application/xml; charset=utf-8', but it makes no difference.

The only place I handle it differently is when transferring the file to the TiVo: I generate the TVBusQuery XML as UTF-8, but dump the raw bytes into the header since it is a binary stream. That may be why it works in the TiVo UI but not in the HMO UI...? https://github.com/wr0ngway/tivohmo/blob/master/lib/tivohmo/server.rb#L163

If I manually create a pyTivo metadata file containing that string, the same thing happens.

Any ideas?


----------



## davidblackledge (Sep 9, 2008)

Well, my first thought is the HMO app is probably implemented with HME, and the fonts available to HME don't handle a lot of characters right (I have a whole character downgrade mapping I do for EWz so it can display pages with fewer boxes), but the HD UI has better fonts.
I'd bet if you had a file name with such a character served by TiVo Desktop, you'd probably have the same behavior.


----------



## wr0ngway (Jan 4, 2014)

Ok, thanks, that makes sense. I guess I'll just add transliteration whilst browsing, but leave the full utf8 string in place for the transfer.
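A minimal sketch of that split, assuming the server knows whether it is answering a browse request or a transfer. The `description_for` helper and the `:browse` / `:transfer` labels are hypothetical and do not come from tivohmo; the transliterator is passed in as any callable.

```ruby
# Hypothetical helper (not tivohmo's API): transliterate only for the browse
# UI, and keep the original UTF-8 text for the transfer metadata.
def description_for(text, context, transliterator)
  context == :browse ? transliterator.call(text) : text
end

# Illustrative transliterator that only handles curly single quotes.
to_ascii = ->(s) { s.gsub(/[\u2018\u2019]/, "'") }

puts description_for("It\u2019s on", :browse, to_ascii)
puts description_for("It\u2019s on", :transfer, to_ascii)
```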


----------



## davidblackledge (Sep 9, 2008)

wr0ngway said:


> Ok, thanks, that makes sense. I guess I'll just add transliteration whilst browsing, but leave the full utf8 string in place for the transfer.


Here's what EWz currently uses:

```
// fixTextMap is assumed to be a Map<Character, String> declared elsewhere
// non-breaking space
fixTextMap.put((char)0x00a0, " ");
// en/em space, thinspace
fixTextMap.put((char)0x2002, " ");
fixTextMap.put((char)0x2003, "  ");
fixTextMap.put((char)0x2009, " ");

// commonly used horizontal ellipses character (&hellip;)
fixTextMap.put((char)0x2026, "...");
// quotes
fixTextMap.put((char)0x2018, "'"); // ' left
fixTextMap.put((char)0x2019, "'"); // ' right
fixTextMap.put((char)0x201C, "\""); // " left
fixTextMap.put((char)0x201D, "\""); // " right
// em-dash (and en-dash) appears to be a problem, too
fixTextMap.put((char)0x2013, "-");  // en-dash
fixTextMap.put((char)0x2014, "--"); // em-dash
// how often do the rest of these get used? just slows down the conversion IMO.
// fixTextMap.put((char)0x201B, "'"); // high 9
// fixTextMap.put((char)0x201A, ","); // low 9
// fixTextMap.put((char)0x201F, "\""); // " high 9
// fixTextMap.put((char)0x201E, ",,"); // " low 9
fixTextMap.put((char)0x2022, "" + ((char)0x00B7)); // bullet - use mid-dot
fixTextMap.put((char)0x2122, "tm"); // trademark
// fixTextMap.put((char)0x301D, "\""); // double prime left
// fixTextMap.put((char)0x301E, "\""); // double prime right
// fixTextMap.put((char)0x301F, ",,"); // low double prime
```
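For a Ruby project, the active entries of the table above could be ported roughly as follows. This is a sketch, not code from EWz or tivohmo: `FIX_TEXT_MAP`, `FIX_TEXT_PATTERN`, and `fix_text` are hypothetical names, and the input is assumed to be UTF-8. Like the Java original, it only rewrites the listed characters and leaves everything else (including other non-ASCII text) untouched.

```ruby
# Ruby port of the uncommented entries in the EWz table (hypothetical names).
FIX_TEXT_MAP = {
  "\u00A0" => ' ',      # no-break space
  "\u2002" => ' ',      # en space
  "\u2003" => '  ',     # em space
  "\u2009" => ' ',      # thin space
  "\u2026" => '...',    # horizontal ellipsis
  "\u2018" => "'",      # left single quote
  "\u2019" => "'",      # right single quote / apostrophe
  "\u201C" => '"',      # left double quote
  "\u201D" => '"',      # right double quote
  "\u2013" => '-',      # en dash
  "\u2014" => '--',     # em dash
  "\u2022" => "\u00B7", # bullet -> middle dot (still non-ASCII, as in EWz)
  "\u2122" => 'tm'      # trademark sign
}.freeze

# One regexp that matches any key of the map.
FIX_TEXT_PATTERN = Regexp.union(FIX_TEXT_MAP.keys)

# Replace every mapped character with its substitute in a single pass.
def fix_text(text)
  text.gsub(FIX_TEXT_PATTERN, FIX_TEXT_MAP)
end

puts fix_text("It\u2019s \u201Cfine\u201D \u2014 really\u2026")
```

Passing the hash as the second argument to `gsub` does the lookup per match, so the string is scanned once rather than once per entry.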


----------



## wr0ngway (Jan 4, 2014)

Thanks - I'm actually using iconv for this, but I fed a number of your examples through it to make sure it was doing the right thing. Strangely enough, iconv has been deprecated in Ruby for a while, but I still haven't found anything else that transliterates as well as it does, so I'm sticking with it for now.


```
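require 'iconv' # provided by the iconv gem on Ruby 2.0 and later

# Transliterate a UTF-8 string to ASCII; return the input unchanged on failure.
def transliterate(s)
  converter = Iconv.new 'ASCII', 'UTF-8'
  converter.transliterate = true
  converter.conv(s) rescue s
end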
```
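If iconv is not available, something similar can be approximated with core Ruby (2.2 or later for `unicode_normalize`). This is a rough sketch under stated assumptions, not a drop-in replacement: `ascii_fallback` and `PUNCTUATION` are hypothetical names, the input is assumed to be valid UTF-8, and the results will not match iconv's transliteration in every case. NFKD normalization already turns characters such as U+2026 and U+00A0 into ASCII, but curly quotes and dashes have no decomposition, so they need an explicit table.

```ruby
# Hypothetical iconv-free fallback (not part of tivohmo).
# Characters missing from the table that have no ASCII form become '?'.
PUNCTUATION = Hash.new('?').merge(
  "\u2018" => "'", "\u2019" => "'",  # single quotes
  "\u201C" => '"', "\u201D" => '"',  # double quotes
  "\u2013" => '-', "\u2014" => '--'  # en dash, em dash
).freeze

def ascii_fallback(text)
  text.unicode_normalize(:nfkd)  # e.g. U+00E9 -> "e" + U+0301, U+2026 -> "..."
      .gsub(/\p{Mn}/, '')        # drop the combining marks left by NFKD
      .encode('US-ASCII', fallback: PUNCTUATION)
rescue StandardError
  text                           # like the helper above, return the input on failure
end

puts ascii_fallback("It\u2019s caf\u00E9\u2026")
```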


----------

