One of my users reported that even though they typed out the full name of the streamer in my app, the correct channel was only 17th in the result list. When I call the API, I request the first 50 results, which is important here.
https://api.twitch.tv/helix/search/channels?query=lenamoon&first=50
My user used a channel called lenamoon as an example, so I will use her channel in my example as well.
First, here is how this query result looks on the official site:
https://www.twitch.tv/search?term=lenamoon
It is the first result.
Obviously, the site is not using the API that we have to use as developers, but still, Twitch can search if it wants to.
I did some tests and noticed something: the larger the requested result list is, the less relevant the results are. With first=1, the only result is Lena’s channel, but with first=50, Lena’s channel is 17th.
Now, my expectation is that the quality of the search result should not depend on the size of the requested result list. Maybe I’m wrong here, but I don’t think it’s a crazy expectation.
So I did a few tests with different first=<value> settings, searching for lenamoon and noting the position of her channel in the search result:
- first=1 → 1st
- first=2 → 1st
- first=3 → 2nd
- first=4 → 2nd
- first=5 → 2nd
- first=10 → 3rd
- first=20 → 7th
- first=50 → 17th
- first=100 → 37th
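For anyone who wants to reproduce the positions above, here is a minimal sketch of the measurement. It is not the exact code I ran: the helper names are made up for illustration, the Client-Id and token are placeholders you have to supply yourself, and it assumes each entry in the response's data array carries the login in a broadcaster_login field.

```python
import json
import urllib.parse
import urllib.request

SEARCH_URL = "https://api.twitch.tv/helix/search/channels"


def build_search_url(query, first):
    """Build the search URL for a given query and requested result size."""
    params = urllib.parse.urlencode({"query": query, "first": first})
    return f"{SEARCH_URL}?{params}"


def position_of(login, channels):
    """Return the 1-based position of `login` in the results, or None."""
    for index, channel in enumerate(channels, start=1):
        # Assumption: the login is exposed as `broadcaster_login`.
        if channel.get("broadcaster_login", "").lower() == login.lower():
            return index
    return None


def search_channels(query, first, client_id, token):
    """Call the search endpoint once and return its `data` array."""
    request = urllib.request.Request(
        build_search_url(query, first),
        headers={"Client-Id": client_id, "Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["data"]


def measure(query, sizes, client_id, token):
    """Map each requested size to the position of the exact-match channel."""
    return {
        first: position_of(query, search_channels(query, first, client_id, token))
        for first in sizes
    }
```

Calling measure("lenamoon", [1, 2, 3, 4, 5, 10, 20, 50, 100], client_id, token) with your own credentials gives one position per requested size. The numbers may differ from mine between runs.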
If I use my name as a search query, I get roughly the same results. This is not unique to her.
With first=50, 10 of the channels ranked ahead of her don’t even contain the string lenamoon in their name… Here is the first=50 result list, down to her channel:
LeonaMoon
LenaMoongrove
lenamoore
lenamoor
LenaMoonyear
leanmoon2125
lenamoon1981
lenamoow
leamoonsie
Leanmono09
lenamoon07
lenamoonlight11
LeaMoonchild
lenamoneee
lenamoon96
lenamoonlove
LeaMoonlight
lenamoon
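The count of names that don't contain the query can be checked with a few lines. This is just a sketch over the names listed above, using a plain case-insensitive substring test; the function name is made up for illustration.

```python
QUERY = "lenamoon"

# Names copied from the first=50 list above, in the order shown.
RESULTS = [
    "LeonaMoon", "LenaMoongrove", "lenamoore", "lenamoor", "LenaMoonyear",
    "leanmoon2125", "lenamoon1981", "lenamoow", "leamoonsie", "Leanmono09",
    "lenamoon07", "lenamoonlight11", "LeaMoonchild", "lenamoneee",
    "lenamoon96", "lenamoonlove", "LeaMoonlight", "lenamoon",
]


def split_by_substring(query, names):
    """Split names into those containing the query and those that don't."""
    q = query.lower()
    containing = [name for name in names if q in name.lower()]
    missing = [name for name in names if q not in name.lower()]
    return containing, missing


containing, missing = split_by_substring(QUERY, RESULTS)
print(f"{len(missing)} names do not contain {QUERY!r}:")
for name in missing:
    print(" ", name)
```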
But that’s nothing. Let’s go back to the documentation (Reference | Twitch Developers):
“Gets the channels that match the specified query and have streamed content within the past 6 months.”
OK, let’s see whether this is true and those 17 channels really did stream in the past 6 months.
With first=50, out of the 17 channels ahead of her:
- 10 of them NEVER STREAMED
- 1 last streamed in 2017
- 2 last streamed in 2021
- 1 last streamed in 2022
- 3 streamed in the past 6 months
At least with kraken we got the channel’s followerCount, so you could order your search results by that and get a more “relevant” result. With helix we don’t have this luxury, because followerCount is no longer included.
As a closing note, going back to first=50: none of those 7 channels that have EVER streamed are partners or have more than 250 followers. Lena is a partner with ~5k followers.
Thanks for coming to my TED Talk. So, can we get this fixed? Just kidding, it probably won’t ever be fixed, just like all the other buggy API endpoints, but at least I’ve tried.
P.S.: you could argue that fuzzysearch could fix this. And indeed it can, as long as the user types out the whole name of the channel, but if the search_query only contains part of the channel’s name, fuzzysearch falls apart.
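To make that concrete, here is a rough sketch of such a client-side re-rank. Python's difflib stands in for whatever fuzzy matcher you would actually use, the tiering (exact, prefix, substring, rest) is my own assumption and not anything Twitch does, and it can only reorder what the API has already returned.

```python
from difflib import SequenceMatcher


def rank_key(query, name):
    """Sort key: exact match, then prefix, then substring, then the rest."""
    q, n = query.lower(), name.lower()
    if n == q:
        tier = 0
    elif n.startswith(q):
        tier = 1
    elif q in n:
        tier = 2
    else:
        tier = 3
    # Within a tier, more similar names sort first.
    similarity = SequenceMatcher(None, q, n).ratio()
    return (tier, -similarity)


def rerank(query, names):
    """Return the names reordered by rank_key for the given query."""
    return sorted(names, key=lambda name: rank_key(query, name))


sample = ["LeonaMoon", "lenamoore", "lenamoon07", "lenamoon"]
reranked = rerank("lenamoon", sample)
```

With the full name as the query, the exact match moves to the top. With a partial query such as lena, every name that starts with it lands in the same tier, so the order inside that tier comes down to string similarity alone, which says nothing about which channel the user actually meant.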
P.S.2: in the screenshot, Lenamoonxo is 37th (first=50), and ilanamoon is not part of any of the search results that I got.
P.S.3: I wanted to include the whole JSON response for first=50, but in this result her channel is 19th instead of 17th like in my initial test… wow
https://nopaste.net/2xovq9E3kp