The recording will be available later, but if you're interested in enabling a globally consistent view of hashtags, or know stuff about DHTs or #ActivityPub relays, you can have a look at the paper: https://git.orlives.de/schmittlauch/paper_hashtag_federation/src/branch/master/paper_hashtag_federation.pdf
Please contact me with any questions, remarks or other feedback!
So here's a TL;DR:
The problem with the current state of hashtags in the fediverse is that users have a fragmented view of posts, depending on their instance. Some posts containing a hashtag never reach your own instance, so you won't see them. This is bad for decentralisation and coordination.
I plan to throw some additional P2P stuff at the fediverse: All instances distribute the responsibility for relaying or storing posts (just their IDs) among themselves using a Distributed Hash Table. ->
Now we have defined entities in the network that we can subscribe to for posts of a hashtag, query for its history, and send posts to when publishing.
I considered load balancing, redundancy and security concerns. My solutions may not be perfect, but they're the best I could come up with.
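To illustrate the idea of splitting hashtag responsibility among instances, here is a minimal sketch using a plain consistent-hashing ring. It is only an assumption about how such a mapping could look, not the DHT design from the paper; the class name `HashtagRing`, the function `ring_position` and the instance names are all hypothetical.

```python
import bisect
import hashlib


def ring_position(key: str) -> int:
    """Map a string to a position on the hash ring (SHA-256 digest as an integer)."""
    return int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest(), "big")


class HashtagRing:
    """Hypothetical ring: each instance is responsible for the hashtags
    whose hash falls at or before its own position on the ring."""

    def __init__(self, instances):
        if not instances:
            raise ValueError("at least one instance is required")
        self._ring = sorted((ring_position(name), name) for name in instances)
        self._positions = [pos for pos, _ in self._ring]

    def responsible_for(self, hashtag: str) -> str:
        # Normalise so "#ActivityPub" and "activitypub" map to the same key.
        key = hashtag.lstrip("#").lower()
        idx = bisect.bisect_left(self._positions, ring_position(key))
        # Wrap around to the first instance when we run past the end of the ring.
        return self._ring[idx % len(self._ring)][1]


# Hypothetical instance names, for illustration only.
ring = HashtagRing(["a.example", "b.example", "c.example"])
print(ring.responsible_for("#ActivityPub"))
```

Every instance that knows the same member list computes the same answer, so publishers and subscribers of a hashtag end up at the same place without a central coordinator.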
Btw, the slides of my #ActivityPubConf talk are available here: https://git.orlives.de/schmittlauch/paper_hashtag_federation/src/branch/master/talk-slides.pdf
The recording of my talk at #ActivityPubConf on Hashtag Federation is now online.
It is a nice overview of my work.
@schmittlauch Wow, much respect for thinking about this. It's the largest problem IMHO since the beginning.
I still think the solution is separate software, so that not every platform has to implement anything more complex than extra search REST API calls. I have not read your (long!) paper yet.
My current preferred solution is a network of index servers that share data over a P2P network or similar. Platforms communicate with them over REST APIs.
@schmittlauch the index servers wouldn't need the whole content of the posts, only 1) ID and 2) hashtags. This would allow platforms to make search requests for IDs and then fetch the actual content by ID. This would make running the specialized index servers lightweight and would not introduce additional complex P2P development requirements to platforms themselves. Implementing AP alone is hard enough.
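A minimal sketch of such a lightweight index, assuming it stores nothing but post IDs keyed by hashtag. The class `HashtagIndex`, its methods and the example URLs are hypothetical and not taken from any existing relay or index server.

```python
from collections import defaultdict


class HashtagIndex:
    """Hypothetical in-memory index: hashtag -> list of post IDs, no post content."""

    def __init__(self):
        self._ids_by_tag = defaultdict(list)

    def add(self, post_id: str, hashtags):
        """Record a post ID under each of its hashtags (duplicates are ignored)."""
        for tag in hashtags:
            key = tag.lstrip("#").lower()
            if post_id not in self._ids_by_tag[key]:
                self._ids_by_tag[key].append(post_id)

    def search(self, hashtag: str):
        """Return the known post IDs for a hashtag, in insertion order."""
        key = hashtag.lstrip("#").lower()
        return list(self._ids_by_tag.get(key, []))


index = HashtagIndex()
index.add("https://a.example/posts/1", ["#Fediverse", "#ActivityPub"])
index.add("https://b.example/posts/7", ["#fediverse"])
print(index.search("#fediverse"))
```

A platform would then fetch each returned ID from its origin server to get the actual content, so the index itself never has to hold or serve post bodies.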
@jaywink I also propose the separation into a transparent application proxy component; lain even suggested implementing it as a relay.
Regarding "index servers" it depends on what you mean by that: If the index servers are supposed to crawl the Fediverse themselves then good luck with keeping up with its load: At the scale of that'd be ~140,000 posts/second. Furthermore, the indexers might not even know each server.
Thus I propose that, as an opt-in, instances actually push ->
@jaywink their published posts to the responsible indexing server. Though it's not an indexing server but just a relay server, which will itself forward the post to a longer-term indexing/storage server.
To distribute the load and avoid a central point of authority or failure, I make each server responsible for just a subset of hashtags.
I'll let you know once the recording of my talk is released, in case you need an easier start than a 24-page paper ;)
@schmittlauch Great, sounds good 👍 Yeah no totally didn't mean crawling, I mean push from opt-in servers, just like the current relays work (well, the diaspora one at least, not sure how the AP ones work).
Just to understand, the network of hashtag servers would support anyone hosting one as part of the network? So you don't mean that some organization is in charge of hosting it?
@schmittlauch Science papers are a bit difficult for me - I tend to have a really hard time focusing after a page. So looking forward to a tl/dr ;)
@schmittlauch BTW just reading your talk slides.
One error: the #diaspora relay allows subscribers to choose either "all" or "a list of tags". Your talk indicated only the former.
Slides look good, though most of the science speak goes over my head ;) I think the important thing to consider is that whatever the complexity of the relay/indexers/distributed nodes, the API towards implementing social platforms MUST be simple and easy to understand.
@schmittlauch Great initiative. Since I see that there are instances which require that users post OT content as "Unlisted", it is difficult to spread a message beyond one's own group of followers or direct instance peers.
I wonder how the hashtags would be accessible in the UI. Would that be like a separate timeline, or would subscribed hashtags appear in the Home timeline?
@schmittlauch Thanks for the notification.
@schmittlauch hard at work, hard at work. Because I'm too lazy. Hashtags instead of links? I could never really get used to them. They seem odd to me. Oh well, search terms and keywords have always been around.
@schmittlauch And I thought it was a great talk.
@Ahuka thanks =)