
Topic: [feature] Is it possible and acceptable to add a recommendation algorithm?

Posted under Site Bug Reports & Feature Requests

Requested feature overview description.

For a long time, I have had to look through dozens of artworks to find the pieces I really want. To be honest, this gets tiring, especially when a search goes on for 20 to 30 minutes without a result.

So it came to my mind while I was browsing some other websites with recommendations: with tags and favs publicly available on e621, it could be practical to deploy a recommendation algorithm, and even a simple one may make a huge difference to searching for posts. The related tech is more accessible today, so it may not be too hard to have one on this site, and many of the algorithms have reasonable computation costs.

However, this suggestion may encounter certain issues:

1) Privacy concerns: the algorithm uses data such as users' favs. While the data provided to it would be publicly accessible in most cases, some users may still find this disturbing.

2) Costs, mostly computation resources: it WILL cost something. The cost can be controlled, but it may not be acceptable for every organization.

3) Work on rearranging the site structure: it can be very simple, like just adding a sorting tag that represents recommendation results, or much more complicated, like adding a whole new page to the site.

So these are my thoughts on ways to improve the experience, and as a CS student, I can help with this idea if possible!

Best regards!

Why would it be useful?

A recommendation feature is very helpful! It saves a lot of time on searching! If it helps a lot, why not?

What part(s) of the site page(s) are affected?

It depends on the scale: frontend changes can range from adding a tag to adding a page. It is mostly a backend change.

There have been a few third-party attempts at this but to my knowledge they're all kinda outdated or inaccessible, like topic #25639.

The site footer has a link to our github repository so you can just straight up implement this yourself if you want.

Ew no. My experience with algorithmic recommendations is that they actively make it harder to find the niche shit I actually want to see because they favor what's popular/posted at specific times/uses certain keywords.

Donovan DMC

Former Staff

regsmutt said:
Ew no. My experience with algorithmic recommendations is that they actively make it harder to find the niche shit I actually want to see because they favor what's popular/posted at specific times/uses certain keywords.

The closest we've gotten to a recommender was importing danbooru's recommender, which doesn't care about time or really amount of favorites, it cross references things you've favorited with things people with similar favorites have also favorited
I haven't actually seen it in production but I'd imagine it works decently well
It also doesn't change any existing parts of the site, it's a whole new page you have to visit

Though, it got denied multiple times internally citing a lack of perceived use
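
The idea described above (cross-referencing your favorites with the favorites of people whose favorites overlap with yours) can be sketched in a few lines. This is only a minimal illustration of user-based collaborative filtering, not Danbooru's actual code; the function name `recommend`, the `favorites` mapping, and the choice of Jaccard similarity are all assumptions made for the example.

```python
from collections import defaultdict


def recommend(user, favorites, top_n=5):
    """Suggest posts for `user` from users with overlapping favorites.

    `favorites` is a hypothetical mapping of user name -> set of post ids.
    """
    mine = favorites[user]
    scores = defaultdict(float)
    for other, theirs in favorites.items():
        if other == user:
            continue
        overlap = len(mine & theirs)
        if overlap == 0:
            continue  # no shared favorites, so this user tells us nothing
        # Jaccard similarity: shared favorites / all favorites of both users
        similarity = overlap / len(mine | theirs)
        # Every post they favorited that we have not gets a weighted vote
        for post in theirs - mine:
            scores[post] += similarity
    # Highest score first; ties broken by post id to keep the order stable
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [post for post, _ in ranked[:top_n]]


favorites = {
    "a": {1, 2, 3},
    "b": {2, 3, 4},
    "c": {3, 5},
    "d": {9},
}
print(recommend("a", favorites))
```

Note that neither post age nor total favorite count enters the score here, which matches the description above. It also shows why a short favorites list gives weak results: with few favorites there are few overlaps to draw votes from.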

donovan_dmc said:
The closest we've gotten to a recommender was importing danbooru's recommender, which doesn't care about time or really amount of favorites, it cross references things you've favorited with things people with similar favorites have also favorited
I haven't actually seen it in production but I'd imagine it works decently well
It also doesn't change any existing parts of the site, it's a whole new page you have to visit

Though, it got denied multiple times internally citing a lack of perceived use

The Danbooru recommender gets very, very stale unfortunately. Mine is basically clogged with things from years ago, even when adding like a fifth more images to my total favorites.
I suppose it may well work better for those with favorites in the thousands or larger, but when it's in the hundreds, it seems difficult to influence.

Donovan DMC

Former Staff

quenir said:
The Danbooru recommender gets very, very stale unfortunately. Mine is basically clogged with things from years ago, even when adding like a fifth more images to my total favorites.
I suppose it may well work better for those with favorites in the thousands or larger, but when it's in the hundreds, it seems difficult to influence.

Considering it uses your favorites to reference other favorites, I'm not surprised it doesn't do well when you don't have many favorites

donovan_dmc said:
The closest we've gotten to a recommender was importing danbooru's recommender, which doesn't care about time or really amount of favorites, it cross references things you've favorited with things people with similar favorites have also favorited
I haven't actually seen it in production but I'd imagine it works decently well
It also doesn't change any existing parts of the site, it's a whole new page you have to visit

Though, it got denied multiple times internally citing a lack of perceived use

favs are for nerds.

also, I dunno, I feel like even if I did actually use the favorites system I don't think I could trust it to recommend anything useful. like, I imagine if I faved a bunch of the pages of a bunch of comics that I like, almost all of the posts are just going to be other pages from the same comics, a bunch of random garbage, with a couple semi-useable recommendations in between. so I'd have to just fill my favs with every single page of every single comic I've ever read just so that I don't get recommended all of the comics I've ever read.

lafcadio said:
There have been a few third-party attempts at this but to my knowledge they're all kinda outdated or inaccessible, like topic #25639.

The site footer has a link to our github repository so you can just straight up implement this yourself if you want.

Ok, I will attempt to work on it once I've got enough information on this topic. But for data collection, it would be much easier if the limit on API request frequency could be raised a bit, like to 10~20 per second (any improvement in access rate would be ok, even changing from 1/s to 2/s would double the efficiency).

Is it possible to do so? Or can this data be found elsewhere?

Donovan DMC

Former Staff

12hydrogen said:
Ok, I will attempt to work on it once I've got enough information on this topic. But for data collection, it would be much easier if the limit on API request frequency could be raised a bit, like to 10~20 per second (any improvement in access rate would be ok, even changing from 1/s to 2/s would double the efficiency).

Is it possible to do so? Or can this data be found elsewhere?

The request limit is already 2/second
You also definitely should not attempt to download all favorites through the api
At minimum we're talking millions of requests no matter which way you attempt to fetch this data
There's over 1 billion total favorites, fetching favorites per user is at minimum 2 million requests, many more since a lot of people will have >320 favorites
per post you'd have to scrape html, and at minimum that would be 5 million requests (less if you filter out 0 fav posts, but still a ludicrously high amount), many more since a lot of posts have >320 favorites
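
To put those numbers in perspective, here is a rough back-of-the-envelope calculation using only the figures quoted above (2 requests per second, at least 2 million requests per user, at least 5 million per post). The helper name `days_needed` is made up for this example, and it assumes the scraper runs nonstop at the full rate limit with no retries or failures.

```python
RATE_LIMIT = 2          # requests per second, as stated above
SECONDS_PER_DAY = 86_400


def days_needed(requests, rate=RATE_LIMIT):
    """Days of nonstop scraping needed to send `requests` at `rate` per second."""
    return requests / rate / SECONDS_PER_DAY


per_user = days_needed(2_000_000)   # lower bound when fetching per user
per_post = days_needed(5_000_000)   # lower bound when scraping per post
print(f"per user: {per_user:.1f} days, per post: {per_post:.1f} days")
```

That works out to roughly 11.6 days per user and 28.9 days per post as lower bounds, and the real totals would be higher because of the users and posts with more than 320 favorites.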

You might be able to convince the sysadmin Dari to give you a dump of the table if you ask nicely, but what Lafcadio meant is implementing the feature and making a pull request on the github repository

But as I've said I've already personally attempted this and got denied

I think it's a decent idea as its own separate, isolated page, like the Hot or Popular pages. Hard no to any algorithm slop beyond that though. Like regsmutt said, algorithms tend to shit up the experience. Which also results in users trying to game the system and get the upper hand.

Though if we're going that route at all, we might as well just let people favourite tags so they can customise what shows up. Maybe let them add weights to each too. Don't really see much point in *just* a recommendation section when we have all the search syntax and a plethora of tags. You can very much narrow down what you're looking for unless it's something very abstract or nebulous
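
As a rough sketch of what weighted favourite tags could look like: each post is scored by summing the weights of the tags it carries, and posts are then listed by score. Nothing here reflects how the site actually works; `score_post`, `rank_posts`, the example tags, and the choice to allow negative weights are all assumptions for illustration.

```python
def score_post(post_tags, tag_weights):
    """Sum the user's weights for the tags on a post (unlisted tags count as 0)."""
    return sum(tag_weights.get(tag, 0.0) for tag in post_tags)


def rank_posts(posts, tag_weights):
    """Return post ids with a positive score, highest score first.

    `posts` is a hypothetical mapping of post id -> set of tag names.
    """
    scored = [(score_post(tags, tag_weights), post_id)
              for post_id, tags in posts.items()]
    # Sort by descending score, then by post id so ties are stable
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [post_id for score, post_id in scored if score > 0]


# Hypothetical user preferences: a negative weight pushes a tag down
tag_weights = {"dragon": 2.0, "wolf": 1.0, "comic": -3.0}
posts = {
    1: {"dragon", "comic"},
    2: {"wolf"},
    3: {"dragon", "wolf"},
}
print(rank_posts(posts, tag_weights))
```

Because the weights come entirely from the user, nothing in this scheme depends on popularity or posting time, so there is nothing for other users to game.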