Matching problems on shows with a colon in the name

I’m having problems matching shows with colons in the title against TheTVDB.com.

Example: “H2O: The Molecule That Made Us” is TheTVDB.com series ID 380647.
Because “:” (colon) is not a valid filename character in Windows (and maybe other filesystems too), it gets removed from the recording filename, for example by the HDHR DVR. It also lower-cases the “O” in H2O, but that isn’t a problem; if the title matched, it would be fine.

To me, it looks like the colon (“:”) is removed from the metadata as well.
I don’t know how you would search TheTVDB.com without knowing that a missing “:” is preventing the match, or whether you can put wildcards or any-character markers in the search.

In regex terms, maybe on a failed match you could retry with “H2O? The Molecule That Made Us”, or try dropping words from the front, e.g. “The Molecule That Made Us”, to see whether that gives a match. I’m just spitballing from the things I tried to get a match manually in the Plex matching lookup.

It would be nice to be able to figure out there is a missing “:” in the title, and if you don’t get a match, try with the colon. It seems to me that TheTVDB.com could do better by also not including punctuation in titles to give broader matches, or to allow a secondary lookup without punctuation to see if that yields a match.
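A minimal sketch of the “secondary lookup without punctuation” idea, done on the client side: compare titles on a key that ignores case, punctuation, and spacing. The names `match_key` and `same_title` are made up for illustration; they are not part of MCEBuddy or TheTVDB.

```python
def match_key(title: str) -> str:
    """Reduce a title to lower-cased letters and digits only."""
    # casefold() handles the H2O / H2o difference; isalnum() drops
    # colons, quotes, spaces and any other punctuation.
    return "".join(ch for ch in title.casefold() if ch.isalnum())


def same_title(a: str, b: str) -> bool:
    """True when two titles differ only in case, punctuation or spacing."""
    return match_key(a) == match_key(b)


if __name__ == "__main__":
    print(same_title("H2O: The Molecule That Made Us",
                     "H2o The Molecule That Made Us"))
```

This only helps once you have candidate titles to compare against; it does not change what the remote search itself will accept.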

I have the same problem with “Power Trip: The Story of Energy” as well.
They don’t match without the colon, and only with the colon do they match.
And with that one, you’d have to repeat the “try with an any-character” step, but it won’t match until the second word. The same goes if you started dropping words off the front looking for a match. The problem is you don’t know where the colon is that’s breaking the search.
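Since the position of the missing colon is unknown, one brute-force option is to generate every variant with a colon after each word and retry the lookup with each. A rough sketch, assuming the title is split on whitespace; `colon_candidates` is a hypothetical helper name:

```python
from typing import List


def colon_candidates(title: str) -> List[str]:
    """Return the title with a colon inserted after each word in turn."""
    words = title.split()
    # A colon after the last word makes no sense, so stop one short.
    return [
        " ".join(words[:i]) + ": " + " ".join(words[i:])
        for i in range(1, len(words))
    ]


if __name__ == "__main__":
    for candidate in colon_candidates("Power Trip The Story of Energy"):
        print(candidate)
```

For a six-word title this is only five extra lookups, and the correct form of “Power Trip: The Story of Energy” is the second candidate tried.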

Are there any clues in the HDHR metadata that can help detect the missing colon/punctuation, so that on a failed lookup it could try again with the missing punctuation/non-ASCII characters?

Just some grumbling about having to manually match. But if there is a way…

MCEBuddy does try doing a match after removing special characters but YMMV. If you have the logs I could try to replicate it.

It’s not a question of removing the special characters. It’s a question of putting them back in. :wink:
I think this was the source of problems with shows that have quotes and apostrophes in their titles: some guides would use the plain quote marks and others would use the “smart quotes” (open and close) characters, and since the match wasn’t exact, I had to put corrections into MCEBuddy to force it to use TheTVDB.com show IDs (America’s Test Kitchen, Cook’s Country, etc.).
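For the smart-quote mismatch specifically, a small normalization step before comparing would probably cover it. This is only a sketch: the mapping lists the four common curly quote characters, and `normalize_quotes` is a made-up name.

```python
# Map the curly ("smart") quote characters to their plain ASCII forms.
QUOTE_MAP = str.maketrans({
    "\u2018": "'",   # left single quote
    "\u2019": "'",   # right single quote / typographic apostrophe
    "\u201c": '"',   # left double quote
    "\u201d": '"',   # right double quote
})


def normalize_quotes(title: str) -> str:
    """Replace smart quotes with plain quotes; leave everything else alone."""
    return title.translate(QUOTE_MAP)


if __name__ == "__main__":
    print(normalize_quotes("America\u2019s Test Kitchen"))
```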

I’ll see about posting a log. Done 05/09. Linky

Notes looking through the log file:
Minor: There are some inconsistent uses of the line terminator. Not all lines end in ^M^J (CR-LF).
There’s this:

--> Extracted SiliconDust tags:
Title: Power Trip: The Story of Energy
:
IMDB Id: 
MovieDB Id: 
TVDB Id: 
Is Movie: False
Is Sports: False
:
2020-05-07T02:33:57 MCEBuddy.MetaData.VideoMetaData --> Video Tags extracted -> 
Title: Power Trip: The Story of Energy

So perhaps you might try a lookup in TheTVDB (and other sources) using the title from the metadata if the lookup on the filename title doesn’t match anything. Also, do the overrides match against the filename, the metadata show “title”, or both?
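The fallback order suggested here could look roughly like the following. It is a sketch only: `first_match` is a hypothetical helper, the `lookup` callable stands in for whatever search MCEBuddy really performs, and `FAKE_INDEX` is a toy exact-title table built from the two series IDs quoted in this thread.

```python
from typing import Callable, Iterable, Optional


def first_match(candidates: Iterable[str],
                lookup: Callable[[str], Optional[int]]) -> Optional[int]:
    """Try each candidate title in order; return the first series ID found."""
    seen = set()
    for title in candidates:
        if not title or title in seen:
            continue  # skip blanks and titles already tried
        seen.add(title)
        series_id = lookup(title)
        if series_id is not None:
            return series_id
    return None


# Toy stand-in for a remote search that only matches exact titles.
FAKE_INDEX = {
    "H2O: The Molecule That Made Us": 380647,
    "Power Trip: The Story of Energy": 381292,
}

if __name__ == "__main__":
    filename_title = "Power Trip The Story of Energy"   # colon stripped
    metadata_title = "Power Trip: The Story of Energy"  # from the tags
    print(first_match([filename_title, metadata_title], FAKE_INDEX.get))
```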

OK, now this is weird. It is finding the show now. I’m thinking that because it is new, TheTVDB just added this show after I found the problem recording the first episode, and before I recorded this episode, the third episode.

TVDB Id: 381292

But when I looked it up in TheTVDB manually in the Plex “Match” search, it would not find it without the colon, and would with the colon. Hmmm.

Now I’m not so sure there is an issue. MCEBuddy seems to be finding the series in TheTVDB.
Thanks for offering to help, @Goose .

I had a look and it’s working just fine here. The issue isn’t the colon; it’s the fact that it’s a series with just a title and nothing else to match: no subtitle, air date, episode number, etc.

Could it be that you searched TheTVDB with the “special characters removed” (no colon) title (or original filename title) and not the actual title (with colon) from the metadata? Or do you search both and first one wins?

I ask because almost all of the shows that don’t have much guide metadata are PBS OTA sources and the guide data that does get put into TheTVDB, IMDB, etc. comes from fans. PBS isn’t like the networks or studios that have marketing teams that make sure that their show guide data is widely published and populated.

Then there’s the problem of “erased” and “clean-washed” content (along with the metadata) in the aftermath of scandal, such as the cooking shows from Martha Stewart, Paula Deen, Mario Batali, John Besh, 15 seasons of Chris Kimball’s ATK/Cook’s Country, etc.

PBS/APB also doesn’t have a good track record for metadata for older shows, particularly on their Create channel. Those shows revolve around a particular personality/chef/star, and the marketing was left to the independent producers, back before streaming, and before getting your episode metadata into TheTVDB or IMDB (even less after Amazon took over and monetized it) became vital to marketing the DVDs, streaming them on-demand, or curating home libraries the way they are now. For example, old shows from John Folse and Steven Raichlen (Project Smoke, BBQ-U, etc.), or paywalled shows from Food Network or HGTV.

I replicated your setup and took the title from the metadata with the colon.

Your issue isn’t to do with special characters; your metadata doesn’t have enough information to create a match. See my comment above:

You need two pieces of information to make a match.

The air date and record date are in the filename; it is an HDHR recording. How is it that I can search by name only on TheTVDB and it comes up with a series ID just fine?

MCEBuddy is now matching on both series, so no big deal. I’ll watch for another one to see if it is still happening.

No worries. I’m good.

If it happens again save the logs so we can get a proper look at this.
When matching a series it expects a title and subtitle (episode). Documentaries don’t have episodes so TVDB doesn’t return a match when MCEBuddy tries to match the air date.
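To illustrate the rule as described in this thread (a title plus at least one more piece of information), here is a small check. It is not MCEBuddy’s actual code; the `RecordingInfo` fields and the function name are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RecordingInfo:
    title: str
    subtitle: Optional[str] = None   # episode name
    air_date: Optional[str] = None   # e.g. an ISO date string
    episode: Optional[str] = None    # e.g. "S01E03"


def can_match_episode(info: RecordingInfo) -> bool:
    """True when there is a title plus at least one episode-level detail."""
    extras = (info.subtitle, info.air_date, info.episode)
    # Empty strings count as missing, same as None.
    return bool(info.title) and any(extras)


if __name__ == "__main__":
    title_only = RecordingInfo("Power Trip: The Story of Energy")
    print(can_match_episode(title_only))
```

A title-only recording fails this check even when the series itself can be found by name, which would explain why a manual series search succeeds while the episode match does not.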

Will do. Unfortunately, all the documentaries I record are broadcast in multiple parts, aka “episodes”. Usually 3 or 4, but sometimes more, e.g. Ken Burns on Baseball, Jazz, Country Music, National Parks, etc. And this one on Power has 3.

I’ll see if I can catch it happening.