Hi Reddit, Mojeek team here, today we are extremely happy to look back on 15 years of not tracking the people using Mojeek to search on the Web. Search with surveillance was born in Silicon Valley, search without surveillance was born in Sussex, UK.

Here's that policy on Archive.org: https://web.archive.org/web/20060318104627/http://www.mojeek.com/privacy.html

And our proof: https://twitter.com/mojeek/status/1372109732408918019

EDIT: And our celebratory/history article for 15 years: https://blog.mojeek.com/2021/03/to-track-or-not-to-track.html

DOUBLE EDIT: Mojeek itself: https://www.mojeek.com/

GMT 00:00 - it's late here in the UK, so chuck in anything you have by way of questions and we will hit you back tomorrow, thanks!

Comments: 369 • Responses: 64

thebronzecat341 karma

I just searched for Mature redhead anal and most of the links it found were expired or didn't load at all. What gives?

HoppyBeerKid159 karma

One thing which is massively useful to us is the feedback button at the bottom of the page, if you search using Mojeek and the results aren't great then this is a very useful way of co-building/improving the search engine. Re-crawling is a part of what we do, and hopefully the new servers that we are putting in place at the moment will mean this will happen less and less.

nultero59 karma

Were there any concerns about the no-tracking promise attracting more questionable search patterns?

How do you deal with crawling or being used to search illegal content?

HoppyBeerKid67 karma

We do our best not to index that content so it doesn't matter if people search for it. It's not perfect but we're always trying to improve detection. The policy came about in order to protect vulnerable people (medical searches) and so that protection extends to the crawling aspect of what we do.

Ohuhhhsh1t15 karma

Why should medical searches not be watched? Can they do anything with that information?

HoppyBeerKid69 karma

The rationale is that the person is already in a place where they're probably concerned about something, not being tracked can at least do a tiny teeny little bit to put someone's mind at ease a bit more.

Can they do anything - probably sell you medical products or push you articles/content which could be deleterious to your mental health, maybe in some countries there would be issues re: insurance if the data leaked etc.

AFewStupidQuestions15 karma

Reddit does this. I had subbed a sub dedicated to exercising to relieve depression. Reddit kept pushing the same stupid ads about "Ignoring Your Mental Health? Get Treatment With Us" and "Feeling Suicidal?".

I had to unsub because it kept reminding me that I was depressed when I was busy attempting to stay occupied and in a good mood with positive stuff on here.

HoppyBeerKid5 karma

That's so not okay, I hope this change has helped.

Boatsnbuds30 karma

Medical information can be used against you in lots of ways. For instance a Canadian woman was denied entry to the US because ICE somehow found out she had had a prior history of mental illness.

HoppyBeerKid26 karma

Exactly this. The vast majority of government surveillance is corporate surveillance turned over. There is no reason for anyone's medical conditions to be exposed to and logged by a for-profit entity if it is not strictly necessary for a course of treatment or similar. Where this has to go through private entities (and I talk from a place where I have the NHS, and therefore do not have a deep experience of this) the number of steps the data takes should be minimised, and the process should be as secure as possible.

Studoku229 karma

Am I gregnant?

HoppyBeerKid117 karma

Mojeek says it's hard to tell: https://www.mojeek.com/search?q=am+i+gregnant

oh_Restoration93 karma

Can u get... prregante?

HoppyBeerKid182 karma

At Mojeek we pronounce it preganté

snoggy_loggins140 karma

How do you make money?

HoppyBeerKid179 karma

Great question that one, we think that if people asked this question more then we'd probably be in a much better place. If you want to dive into this in a big way, we recently wrote a post about it, covering investment and actual revenue streams (https://blog.mojeek.com/2020/10/who-funds-mojeek.html).

People fielding you long articles to answer questions isn't fair though, so I'll cut it short here for you: we currently are testing a contextual adverts programme, much like some other non-tracking search offerings, we also have the ability to monetise our API, and have been looking at many other interesting and new options, like the payments for the Web models suggested by entities like Coil (https://coil.com/).

machinelearning_2 karma

How do you plan on targeting ads? Do you think your policy on privacy has hindered possible revenue streams over the last 15 years? If you can’t target ads I think you’re essentially setting up a billboard, and based on the traffic of your site I think you’ll have trouble getting more eyes on ads than a roadside sign in a major city.

HoppyBeerKid3 karma

The very same, the way ads were done before deep tracking was added to the model. People still buy billboards, people still purchase TV ads, people still pay for pages in magazines. With our contextuals there is also an interest and a country-level IP component, but aside from that it's very similar to the mediums you cite which have been a part of advertising strategies from the past up until now. There are other non-tracking search engines who have managed to thrive on the same model, we are also augmenting this by continuing to build an index.

Temporarily__Alone2 karma

maybe reselling ramen?

HoppyBeerKid15 karma

Just the packets, it's chump change for foil, but we love noodles.

BadA55Name62 karma

What sets you apart from browsers like DuckDuckGo?

HoppyBeerKid114 karma

DuckDuckGo also provide non-tracking search, but they do not engage in crawling (the process by which an index of results is built). DuckDuckGo uses Bing and Yandex in order to provide the people using DDG with results (what we call a metasearch engine), whereas Mojeek runs servers which go out to the Web and collect the information needed to build an index. We maintain a pretty cool resource here: searchenginemap.com/ which shows you where results come from when you're using a search engine or metasearch engine.

GuyWithTheStalker43 karma

"DuckDuckGo uses Bing and Yandex in order to provide the people using DDG with results (what we call a metasearch engine), whereas Mojeek runs servers which go out to the Web and collect the information needed to build an index."

How often does that happen, and how does that stack up against DuckDuckGo? I mean... Is Mojeek for babies; are you guys really, really ridiculously good at crawling?

HoppyBeerKid74 karma

I have never had someone make a link between web crawling and babies, I'm going to do the thing the accounting guy does in Parks and Recreation and re-use that over and over in the office until people get very annoyed.

It's just a different model of doing things, crawling is quite resource intensive as a process and if you wanted to compete against Google and Bing in any great way, the best time to start was a good while ago.

We normally say there are about seven crawler-index search engines in the world: Google, Bing, Yandex, Sogou, Baidu, Gigablast, and Mojeek. Basically anything outside of that will be a metasearch engine (multiple indexes pulled from) or a search service (just relying upon the index of one other entity).

molocasa23 karma

Can you share how you benchmark the quality of your search engine’s results across the competition? I feel the biggest barrier to switching from google is the feeling that I tend to get what I am looking for right away.

HoppyBeerKid19 karma

Right now our main input is from feedback. For mainstream queries it's straightforward and obvious when they're not up to par. It's more difficult for longer tail searches, where we want to provide relevant results but also a wider variety of pages that you might not otherwise come across on search engines that use Google's or Bing's indexes.

Quethrosar23 karma

How do we know you don't track? Policy doesn't mean anything.

HoppyBeerKid49 karma

Of course, this is all about trust. We would hope that the length of time we've been around without a "Mojeek is selling/gathering user data" story coming out would help people to take us at our word. For a lot of people that won't be enough and, to be honest, we are very happy that people are that skeptical. That skepticism is going to be vital in all of us getting the Web that we want and need collectively.

We're always doing our absolute best to listen to the people interested in this area and the people using Mojeek, so to turn this on its head and make it a learning opportunity for us, what would assure you that we don't track?

MrWally3 karma

What do you mean “policy doesn’t mean anything”? Aren’t US/EU companies legally bound by privacy policies?

HoppyBeerKid6 karma

I believe this is correct; without creating major FUD, there are probably a decent number of people who don’t follow theirs, enforcement can be a bit lacking. We can’t understand the act of putting something on your website that you have no intention of following, but for sure it happens.

Leones10813 karma

Do you track your users?

HoppyBeerKid10 karma

Ahhhhhh, I see what you did there.

SolzGuy9 karma

Have you tried becoming one of Brave's default search engines yet?

HoppyBeerKid29 karma

We very much appreciate privacy-by-default browsers and Brave seems to have delivered this in a nice tight package (for me personally it is my solid second). This being said, it's hard to get on that list and for a lot of browsers a decent amount of money changes hands in order to get that positioning.

We have attempted to make contact a few times to no avail; with Brave currently looking to launch their own search offering, we're unsure whether Mojeek on Brave could ever happen as a default. We remain hopeful and are always happy to engage with other entities who share our values and view of how the Web could be better.

LovelaceReincarnate7 karma

Can you give low tech, low effort tips to have less data tracked online? Using a non tracking search engine is a good step, do you have others?

HoppyBeerKid20 karma

Everything starts, at least in my eyes, with your choice of browser. Get something which allows you to have extensions, and load up with Decentraleyes and uBlock Origin. I also have recently been messing around with a User Agent switcher (pretending my computer is different, my browser is different etc.) but to be honest that makes a lot of logging on to things difficult.

When it comes to sending messages, email is a pretty insecure medium, but your choice of provider will affect if they're using data from messages etc. in order to augment other sides of their business. If you're using social media, make sure you do it understanding that a lot of these platforms are fundamentally built with data -> targeted adverts as their bedrock. Personally I prefer secure e2e IM over anything else for talking to people.

Another personal one: people can get very into FUD when it comes to privacy, just always remember that it's a marathon, not a sprint. Also, you have my handle, please feel free to ping me about this if you have further questions when this thread disappears into the interwebs ether, I love helping people on this.

InSearchFor_I5 karma

Who?

HoppyBeerKid11 karma

To be honest mate, I do not have a clue.

HuskerNatChamps20205 karma

So since you do not "track" your users what do you do to your users in order to be a profitable company?

HoppyBeerKid10 karma

We are currently trialing contextual ads, so adverts that appear in the results next to search terms that are related to them. On top of this we also have API access as a revenue stream; I guess neither of these things are being done to our users per se. This is one of the big issues with the current targeted ads model for the Web, it sees people as products and customers/service users simultaneously, and these two things play off against each other. A big part of the reason why we do what we do is because we believe that there is a better way of providing search as a service.
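As a rough illustration of the contextual approach described above, here is a minimal sketch in Python. It is not Mojeek's actual implementation; the ad inventory, keyword sets, and function names are all hypothetical. The point it shows is that the only input is the current query, with no user ID, cookie, or search history involved.

```python
import re

# Hypothetical ad inventory: each advert lists the keywords it is relevant to.
ADS = [
    {"id": "ad-noodles", "keywords": {"ramen", "noodles"}},
    {"id": "ad-vpn", "keywords": {"vpn", "privacy"}},
    {"id": "ad-beer", "keywords": {"ale", "lager", "beer"}},
]


def tokenize(query):
    """Lowercase the query and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", query.lower()))


def pick_contextual_ads(query, ads=ADS, limit=2):
    """Rank ads by keyword overlap with the current query only.

    Nothing about the person searching is consulted: the query text
    is the sole signal (an assumption made for this sketch).
    """
    terms = tokenize(query)
    scored = []
    for ad in ads:
        overlap = len(terms & ad["keywords"])
        if overlap:
            scored.append((overlap, ad["id"]))
    # Highest overlap first; ties broken alphabetically by ad id.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [ad_id for _, ad_id in scored[:limit]]
```

Two people typing the same query would get the same adverts under this scheme, which is what separates it from targeting built on a stored profile.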

badgerseattadpoles2 karma

[deleted]

HoppyBeerKid6 karma

Would you be able to chuck me a link to where it was avoided? Not a proof thing, but we wouldn't want to seem as if we're hiding anything. The article linked above outlines a lot of the history with this, as well as this piece which we put out at the end of last year. The answer is that up until this point we've been lucky enough to gain investment from people who understand the value of there being an entity who is going out and crawling the Web, rather than repackaging results from Bing (or to a lesser degree Google). A very good slice of that 15 years was just our founder himself burning the midnight oil between consulting jobs to keep the engine going, it takes a lot of grit and determination to keep going when it comes to this industry :D

badgerseattadpoles-5 karma

[deleted]

HoppyBeerKid6 karma

Every comment is a learning experience, this is no different. For you, specifically, what are the questions you feel are unanswered?

badgerseattadpoles-7 karma

[deleted]

HoppyBeerKid9 karma

Damn, I really am trying here. We're into a level of comment where very few people are going to see it so, human to human, which questions can I answer for you?

ogremadguy3 karma

Is that a dad-gum talking heads reference?

HoppyBeerKid1 karma

Re: u/colinhayhurst another person with the same analysis of the name :D

It is not a Talking Heads reference, it's actually a switched-up version of the word "logic" that our founder came up with way back when a name needed to be picked. This being said, a decent amount of the team are into Talking Heads - personally Remain in Light is my favourite one of their albums.

daffas3 karma

My biggest hang-up about sites that say "we're not tracking you" or "you can delete your data" is that it always seems like they're hiding more than they tell you, just to please people. What kind of data do you store from people that use your search? What other products do you use that offer similar privacy? And final question: are you filtering results like Google does to only show what they want you to see?

HoppyBeerKid9 karma

Our current privacy policy details what we store

"(country, time/date, page requested, referral data, and in a separate log browser data)"

None of this is identifiable. Regarding products, do you mean for the search engine or as a team? For the search engine, none, it's not dependent on anyone else's services. Also, we definitely do not filter results to please any particular side.
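To make the quoted list of logged fields concrete, here is a minimal sketch in Python of logging along those lines. It is an assumption-laden illustration, not Mojeek's code: the `request` dict shape, the `country_lookup` callable, and the function name are hypothetical. The sketch uses the IP address only to derive a country and then drops it, and writes browser data to a separate record that shares no field with the access record.

```python
from datetime import datetime, timezone


def build_log_records(request, country_lookup):
    """Split one request into two unlinked log records.

    `request` is a hypothetical dict with keys ip, path, referrer, user_agent.
    `country_lookup` is a hypothetical callable mapping an IP to a country code.
    The IP itself is used only for the lookup and is never stored.
    """
    access_record = {
        "country": country_lookup(request["ip"]),
        "time": datetime.now(timezone.utc).isoformat(timespec="seconds"),
        "page": request["path"],
        "referrer": request.get("referrer", ""),
    }
    # Browser data goes to a separate log and shares no field with the
    # access record, so the two cannot be joined on a common key.
    browser_record = {"user_agent": request.get("user_agent", "")}
    return access_record, browser_record
```

Whether records like these are truly non-identifying depends on details this sketch does not cover, such as how precise the stored timestamps are and what the requested page contains.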

daffas1 karma

Thanks for the reply! I was more wondering about software that you trust that doesn't have tracking etc. But it's nice to hear that you're not dependent on other services.

HoppyBeerKid3 karma

Our MO is to self-host whatever and whenever you can. Better to put the work in maintaining things running on your own servers than to wake up one day to learn that something has been hacked/had a leak/shut down etc.

cfernnn3 karma

Are your search results organically generated or do you manipulate the algorithms sometimes for certain purposes (equity, etc.)?

For example, if you Google 'happy white woman', the results are roughly 70% African American men and women.

In no way do I mean for this question to sound insincere!

HoppyBeerKid2 karma

Nope and we never have, it's 100% organic and all subject to the same algorithm. We only alter the algorithm in order to try and improve it in the round.

EDIT: phrasing at the end.

fscknuckle3 karma

Why have I never heard of you in 15 years?

HoppyBeerKid5 karma

Our founder has been very focussed on crawling and building an index for the bulk of that time; it's only reasonably recently that the team has started to expand and include non-developer functions. Actually the article linked above contains within it a very good part of that history. Rest assured we'll be making a lot more noise in the coming 15 years.

Tatanbatman3 karma

Can you search big booty Puerto rican goddess?

HoppyBeerKid3 karma

Yes, yes you can.

JohnyyBanana2 karma

So are you just stating that you don’t track your users or do you actually not track your users?

HoppyBeerKid3 karma

We do not track, and have not been doing so for 15 years; this policy was put in place because our founder saw that people were putting sensitive medical queries into Mojeek, and he wanted to assure these, and other vulnerable people, that Mojeek was on their side.

JohnyyBanana3 karma

Excellent, kudos to you guys then!

(My question was sort of a joke btw; rereading it, I'm sorry if I sounded sarcastic)

HoppyBeerKid5 karma

Hey, it's all good! Sarcasm or not, you'd be surprised how many times "we don't track" is actually: "we don't track" /s sike!

JeveStones2 karma

Why do you feel search results shouldn't be tracked at the point of the search engine? How does this search preserve anonymity from surveillance? Do ISPs somehow not still have full insights into the traffic to and from your site?

Something like a VPN would be more effective for end users who want to preserve their anonymity, if that's their focus. I just feel like your only selling point is anonymity, which is not a main concern for most consumers. Most would rather see the focus placed on quality results.

HoppyBeerKid3 karma

"Why do you feel search results shouldn't be tracked at the point of the search engine?"

We don't think that a search engine shouldn't be able to know the queries people are sending to it, we just don't think that should be tied to personally identifiable or even roughly personally identifiable information.

"Do ISPs somehow not still have full insights into the traffic to and from your site?"

When it comes to privacy, there are multiple levels, if you are not taking other measures then it's possible that your ISP will have a large amount of information when it comes to your activities.

On the VPN, yes, this is an extra layer of security and should be used if you feel it is appropriate for you, but should people have to use a VPN to not be tracked by their search engine? I personally understand the value of this service, but I feel we could move faster towards the kind of Web that we want, if we design services to be financially and user-base viable without the need for every single privacy-conscious user of that service to be using a VPN.

Hirokuro2 karma

totally out of context, but what is the beverage of choice of your team?

HoppyBeerKid6 karma

For our recent Christmas get-together we had a video call chat and beer, courtesy of u/colinhayhurst, our CEO. So I would say that beer would likely be the answer to this, the team were quite split between lager and ale when it came to selecting which box though.

LovelaceReincarnate2 karma

Follow up to this, are the devs all buddies?

HoppyBeerKid3 karma

At least from my perspective everyone gets along well, we are a small, distributed team. One thing which we align on heavily is values, which imparts a lot of peace of mind.

Arcturion2 karma

Just tested the search engine. It seems to be strictly limited to searches conducted in the Latin alphabet and turns up no results with other scripts. Is the search engine targeted at the English-speaking population?

HoppyBeerKid3 karma

Currently Mojeek is limited to most Romance/Latin European languages, and most Germanic languages. We have limited resources and so have to be a bit choosy, at least at the current moment. Just out of interest, which language would you need for a search engine to fit around your Web usage?

Arcturion2 karma

Japanese and their script; katakana, hiragana and kanji. Might be too big an ask given your explanation.

HoppyBeerKid11 karma

We understand that one of our biggest strengths as a small team is to be responsive to what the people using Mojeek want and need from it, so never say never.

I will smile fondly if and when I look at the next task in the queue and it says "Japanese language sites, requested by u/Arcturion."

Arcturion6 karma

LOL I will cheer for you.

HoppyBeerKid1 karma

The victory will be all yours, the nation of Japan's, and for other Japanese-speaking individuals regardless of where they're based.

I learnt some very basic Japanese for a trip a lifetime ago and it was perplexing to me, beautiful-sounding language though.

talldean1 karma

How much user benefit do you lose by not having context on what the user has recently searched for? Any guess?

HoppyBeerKid2 karma

That's very much an un-quantifiable thing, and it's also quite a philosophical question; when it comes to accessing information, do you think the majority of the time you want a list of sites that confirm your prior biases, or is it better to not have that happen? And then zoom out once again, is there a wider benefit to society of all people having the same list of results presented to them?

All of these questions are interesting because they're difficult to answer and the answer will change depending upon the context.

talldean2 karma

I get to cheat here, but I've seen the stats on just such a system.

You don't want the set of sites that confirm your biases. You want the set of previous *queries*, so that you learn what the hell the person is asking when their first search wasn't quite right, and you can act on *that*.

Not tracking users in any way is likely gonna be a worse product, *except* for the lack of data stored on servers.

HoppyBeerKid2 karma

But you can do that first system without it being devolved to the single user, right? There will be other factors on top of that, but you don't necessarily need every single thing to be uber personalised for it to work for you.

BadgerMcLovin1 karma

Has anyone ever read your privacy policy?

HoppyBeerKid3 karma

For sure, we like to keep it very short: https://www.mojeek.com/about/privacy/

One of the tactics that is used by some entities on the Web is to make the privacy policy so long that no-one will ever touch it; we understand that people's time is valuable, so it sits at around 2mins max.

JNathanielSmith1 karma

Did you get your name from "Listening Wind" by Talking Heads?

HoppyBeerKid2 karma

It is not a Talking Heads reference, it's actually a switched-up version of the word "logic" that our founder came up with way back when a name needed to be picked. This being said, a decent amount of the team are into Talking Heads - personally Remain in Light is my favourite one of their albums.

From below. We get this one a lot and our founder was completely unaware of this song when the name was picked.

cat-eating-a-salad1 karma

Is this just a thinly veiled attempt at advertising your browser?

HoppyBeerKid4 karma

We are a search engine.

SummonTarpan1 karma

How is babby formed?

HoppyBeerKid2 karma

"How is babby formed?"

Maybe you should ask a search engine, I happen to know of one.

molrobocop1 karma

Where would you give yourself the edge? For example, I use bing for torrents and porn. Google for everything else.

HoppyBeerKid6 karma

We've previously had someone suggest the strapline "rediscover the web that Google disappeared", as well as "rediscover the web and escape your filter bubble!"

MrFeles1 karma

[deleted]

HoppyBeerKid2 karma

I have no clue what this is a reference to, but I would say to him to get the cataracts fixed as soon as possible, that stuff can get real bad.

majorjoe231 karma

Are you really not tracking us?

HoppyBeerKid1 karma

For real.

TizardPaperclip1 karma

Do you plan on ever releasing a version of your search engine with an actual marketable name that regular people will tell their friends about?

Or are you content with serving a niche market of geeks who don't care about that sort of thing?

I think DuckDuckGo already has the unmarketable market cornered, unfortunately : (

HoppyBeerKid2 karma

We all love a challenge.

249ba36000029bbe97491 karma

Where did the name come from?

Why isn't there a link to the search page in the original post?

HoppyBeerKid2 karma

It's a switched-up version of the word "logic" that our founder came up with way back when a name needed to be picked. On the latter part, good point, I'll chuck one up now.

oxipital-38 karma

Uh. How sad do you have to be to change your usage agreement as a marketing tool?

Hi we’re Steam! We’ve updated our terms and conditions! Ask Us Anything!

HoppyBeerKid16 karma

Not a marketing tool - a core part of what we do and have done for 15 years and we only just noticed a week ago that the 15th anniversary of that was today! Praise Archive.org!