pexavc@lemmy.world to

Open Source@lemmy.mlEnglish · 2 years ago

A NSFW detector with CoreML

71

GitHub - lovoo/NSFWDetector: A NSFW (aka porn) detector with CoreML

A NSFW (aka porn) detector with CoreML. Contribute to lovoo/NSFWDetector development by creating an account on GitHub.

Other samples:

Android: https://github.com/nipunru/nsfw-detector-android

Flutter (BSD-3): https://github.com/ahsanalidev/flutter_nsfw

Keras (MIT): https://github.com/bhky/opennsfw2

I feel it’s a good idea for those building native clients for Lemmy to implement projects like these and run offline inference on feed content for the time being, to cover content that is not marked NSFW but should be.
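
A minimal sketch of what such a client-side pass might look like, assuming each feed item carries an image path and an NSFW flag, and that some local model exposes a scoring function returning a probability between 0 and 1. `Post`, `flag_unmarked_posts`, and the 0.8 threshold are hypothetical names and values, not part of any Lemmy client or of the libraries linked above. The scorer is passed in so that any of the linked models could sit behind it.

```python
from dataclasses import dataclass, replace
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Post:
    """Hypothetical feed item; real Lemmy clients have their own post types."""
    post_id: int
    image_path: str
    nsfw: bool = False             # set by the author or a moderator
    locally_flagged: bool = False  # set by this client's own model


def flag_unmarked_posts(
    posts: Iterable[Post],
    score_image: Callable[[str], float],
    threshold: float = 0.8,  # assumed cut-off, would need tuning per model
) -> List[Post]:
    """Mark posts that are not tagged NSFW but score at or above the threshold.

    Posts already tagged NSFW are passed through without being scored.
    """
    result = []
    for post in posts:
        if not post.nsfw and score_image(post.image_path) >= threshold:
            post = replace(post, locally_flagged=True)
        result.append(post)
    return result
```

Keeping the local flag separate from the server-side `nsfw` flag means the client never overwrites what the author or moderators set; it only decides how to present the post locally (blur, hide, or nothing at all if the user has switched the filter off).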

What does everyone think about enforcing further censorship on the client side, especially in open-source clients, as long as it pertains to this type of content?

Edit:

There’s also this, which takes a bit more effort to implement properly but provides a hash that can be used for reporting: https://github.com/AsuharietYgvar/AppleNeuralHash2ONNX
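
NeuralHash needs the extracted model, so the snippet below is not NeuralHash. It swaps in a much simpler perceptual hash (average hash) purely to illustrate the idea of a compact fingerprint that survives small changes and can be attached to a report. The function names and the 8x8 grayscale grid are assumptions made for the example.

```python
from typing import Sequence


def average_hash(gray: Sequence[Sequence[int]]) -> int:
    """Average hash of a small grayscale grid (e.g. an image downscaled to 8x8).

    Each bit is 1 if the pixel is brighter than the grid's mean, else 0.
    The first pixel ends up as the most significant bit.
    """
    pixels = [p for row in gray for p in row]
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


def to_report_id(h: int, n_bits: int = 64) -> str:
    """Fixed-width hex string that could be included in a report."""
    return f"{h:0{n_bits // 4}x}"
```

Two images whose hashes differ by only a few bits are likely the same picture after resizing or recompression, which is what makes this kind of hash more useful for reporting than a plain file checksum.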

Python package (MIT): https://pypi.org/project/opennsfw-standalone/


Daniel Quinn@lemmy.ca
link
fedilink
arrow-up
3·
2 years ago
It’s an interesting idea, and given the direction some economies are moving (looking at you EU & UK) something like this is likely going to feature whether we like it or not. The question for me, however, is: what is the nature of the training data? What some places consider “porn” (Saudi Arabia, the Vatican, the US) is just people’s bodies in more civilised places. The classic “free the nipple” campaign against Facebook’s moderation is an excellent example here: why should anyone trust that this software’s opinion aligns with their own?
- pexavc@lemmy.worldOP
  link
  fedilink
  arrow-up
  1·
  2 years ago
  Yeah, I have been thinking about this exact scenario: how to create solutions around anything that might “filter” while respecting everyone’s worldview. I feel the best approach so far is that, if filters are implemented, they should never be a be-all and end-all and should always be “a toggle”, ultimately respecting the user’s freedom of choice while providing the best-quality tools to use effectively when needed.
BrikoX@lemmy.zip
link
fedilink
English
arrow-up
14
arrow-down
11·
edit-2
2 years ago
2 of them are licensed under BSD-3, so not open source. The 3rd one uses Firebase, so no thanks.

Edit: BSD-3 is open source. I confused it with BSD-4. My bad.
- wildbus8979@sh.itjust.works
  link
  fedilink
  arrow-up
  9
  arrow-down
  1·
  edit-2
  2 years ago
  How is BSD-3 not open source? I think you are confusing “Free/Libre” and Open Source. BSD-3/MIT licenses are absolutely open source. GPL is Free/Libre and Open Source (FLOSS)
  - BrikoX@lemmy.zip
    link
    fedilink
    English
    arrow-up
    4
    arrow-down
    12·
    edit-2
    2 years ago
    It’s not, by the OSD definition. Having the source code available != open source.
    
    And most Lemmy clients I have seen use GPL or AGPL licences, so they couldn’t use code licensed under BSD.
    
    Edit: This is incorrect. I confused it with BSD-4. My bad.
    - wildbus8979@sh.itjust.works
      link
      fedilink
      arrow-up
      17·
      edit-2
      2 years ago
      What in the BSD-3 license goes against OSD exactly?
      
      You are clearly confused. The BSD-3 isn’t only “having the source”, it gives you the right to package, distribute, and modify the source code at will. What it doesn’t have compared to the GPL is protections from someone not sharing their modifications (for example when used in closed source products). In that sense it is more “freedom” than the GPL, but that freedom comes with a cost to the community, and in a sense the freedom afforded to the original author.
      
      It is literally approved by the OSI itself: https://opensource.org/license/bsd-3-clause/
      
      And yes, BSD-3 libraries are compatible with the GPL: https://fossa.com/blog/open-source-software-licenses-101-bsd-3-clause-license/
      
      Is there a confidently wrong community on Lemmy yet?
      - BrikoX@lemmy.zip
        link
        fedilink
        English
        arrow-up
        3
        arrow-down
        1·
        edit-2
        2 years ago
        You are correct. I’m sorry, I confused it with BSD-4 as that used to be the 3rd clause. I updated my post and thank you for calling me out.
        
        wildbus8979@sh.itjust.works
        link
        fedilink
        arrow-up
        2
        arrow-down
        1·
        edit-2
        2 years ago
        That’s still wrong though. The BSD-4 is literally FSF approved. It’s just not GPL compatible and not technically OSI approved. But only on a technicality. The only difference between BSD-3 (BSD New) and BSD-4 (BSD Old) is the advertisement clause. It has nothing to do with redistribution, packaging, or modification of the code. OSI doesn’t agree with the advertisement clause so it’s not officially approved, doesn’t mean it isn’t Open Source.
        
        BrikoX@lemmy.zip
        link
        fedilink
        English
        arrow-up
        3
        arrow-down
        1·
        2 years ago
        That’s where I disagree. While it’s true that the only difference is the GPL compliance, it’s definitely against the spirit of open source and the OSD. So it is a source-available license, but calling it open source is a stretch. The simple fact that it renders the code unusable for GPL projects goes against what open source stands for.
        
        wildbus8979@sh.itjust.works
        link
        fedilink
        arrow-up
        1
        arrow-down
        2·
        edit-2
        2 years ago
        True as that may be, your original statement that “BSD-4 is not open source” is still completely wrong, plain and simple. BSD-4 is not just having access to the source; it gives you significant rights over the source as well. The incompatibility lies with a technicality, an inconvenient one, but a technicality nonetheless. Even the FSF agrees.
      - Jomn@jlai.lu
        link
        fedilink
        arrow-up
        2·
        2 years ago
        Yes it exists: !confidently_incorrect@lemmy.world
    - pexavc@lemmy.worldOP
      link
      fedilink
      arrow-up
      1·
      edit-2
      2 years ago
      good point, but was just providing samples. I myself would gladly create a simple package for inferencing using a properly licensed model file.
      
      Edit: Linked an MIT Keras model, for instance. Also thanks for the tip, I didn’t know about the GPL / BSD relationship.
      - wildbus8979@sh.itjust.works
        link
        fedilink
        arrow-up
        4·
        2 years ago
        That person is wrong about the BSD-3 license, so it’s not a very good “tip”.
        
        pexavc@lemmy.worldOP
        link
        fedilink
        arrow-up
        2·
        2 years ago
        oh i see, just saw your other comment as well
- Scrubbles@poptalk.scrubbles.tech
  link
  fedilink
  English
  arrow-up
  4
  arrow-down
  13·
  2 years ago
  By definition you can’t have some of these things open source, CSAM/NSFW detection needs to be closed source because people are constantly trying to get around it.
  - MinusPi (she/they)@pawb.social
    link
    fedilink
    English
    arrow-up
    19
    arrow-down
    3·
    2 years ago
    Security through obscurity doesn’t work. These systems need to be actually robust, which is only trustworthy with open source
    - Scrubbles@poptalk.scrubbles.tech
      link
      fedilink
      English
      arrow-up
      4
      arrow-down
      8·
      2 years ago
      That is literally not the problem; it’s not security. It’s obfuscation on purpose so things can’t be reverse engineered. I agree with you in most other cases, but this is one where I don’t. It’s the same reason there aren’t public hash lists of these vile images out there: the people posting them would just alter the images. Same with fuzzy hashing and other strategies: these lists and bits of code must remain private so bad actors aren’t tipped off about what trips the detection.
      
      This can’t be a cat and mouse game all the time when it comes to CSAM, it must work for a while. So I’m fully on board with keeping it private while we can, it’s the one area I am okay with doing that. If it’s open bad actors will just immediately find a way to get around detection and all modes of knowing it will be obsolete until we find another way, and in that time we’re waiting to find another way they’re going around posting that shit everywhere, then it doesn’t matter how open source Lemmy is, because all of our domains will be seized.
      - OhNoMoreLemmy@lemmy.ml
        link
        fedilink
        arrow-up
        6·
        2 years ago
        Because any detector has to be based on machine learning, you can open source all the code provided you keep the model weights and training data private.
        
        But there’s a fundamental question here, that comes from Lemmy being federated. How can you give csam detecting code/binaries to every instance owner without trolls getting access to it?
        
        Some instances will be run by trolls, and blackbox access is enough to create adversarial examples that will bypass the model, you don’t need source code.
        
        Scrubbles@poptalk.scrubbles.tech
        link
        fedilink
        English
        arrow-up
        2
        arrow-down
        1·
        2 years ago
        That discussion is happening. Right now the prevailing idea is that it’s an instance admin opt-in feature, where you can host the detection tool yourself or use a hosted one elsewhere. On top of that, instance admins should be allowed to block federating images, so things uploaded on other instances are not federated to us and instead those images are requested directly from the originating instance. That would help cut down on the spread of bad material, and if something was purged on the home instance it could be purged everywhere.
        
        toothbrush@lemmy.blahaj.zone
        link
        fedilink
        arrow-up
        4·
        edit-2
        2 years ago
        Just chiming in here to say that this is very much like security through obscurity. In this context the “secure” part is being sure that the images you host are ok.
        
        Bad actors using social engineering to get the banlist is much easier than using open source AI and collectively fixing the bugs when the trolls manage to evade it. It’s not that easy to get around image filters like this, and having to do weird things to the pictures to pass the filter could be work enough to keep most trolls from posting. Using a central instance that filters all images is also not good, because the person operating the service is then responsible for a large chunk of your images, creating a single point of failure in the fediverse (and something that could be monetised to our detriment). Closed source cannot be the answer either, because if someone breaks the filter, the community can’t fix it, only the developer can. So either the dev team is online 24/7 or it’s paid, making hosting a fediverse server dependent on someone’s closed source product.
        
        I do think however that disabling image federation should be an option. Turning image federation off for some server for a limited time could be a very effective tool against these kinds of attacks!
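
The “turn image federation off for some server for a limited time” idea from the comments above could be sketched roughly like this. Everything here is hypothetical: `ImageFederationPolicy`, `image_url_for`, and the URLs are made up for illustration and are not actual Lemmy settings or APIs.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Dict


@dataclass
class ImageFederationPolicy:
    """Hypothetical per-instance switch; not an actual Lemmy setting."""
    blocked_until: Dict[str, datetime] = field(default_factory=dict)

    def block(self, instance: str, now: datetime, duration: timedelta) -> None:
        """Stop mirroring images from `instance` until now + duration."""
        self.blocked_until[instance] = now + duration

    def should_mirror(self, instance: str, now: datetime) -> bool:
        """True if images from `instance` may be copied to local storage."""
        until = self.blocked_until.get(instance)
        return until is None or now >= until


def image_url_for(
    policy: ImageFederationPolicy,
    instance: str,
    original_url: str,
    local_url: str,
    now: datetime,
) -> str:
    """Serve the local copy when mirroring is allowed; otherwise link straight
    to the home instance, so nothing from it is stored locally."""
    return local_url if policy.should_mirror(instance, now) else original_url
```

Because the block carries an expiry, an admin can switch mirroring off during an attack without having to remember to turn it back on, and while it is off a purge on the home instance takes the image down for everyone who was only linking to it.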
yokonzo@lemmy.world
link
fedilink
English
arrow-up
13
arrow-down
34·
2 years ago
I’m always against censoring made content because it’s led to the pretty puritanical society we live in
- Scrubbles@poptalk.scrubbles.tech
  link
  fedilink
  English
  arrow-up
  31
  arrow-down
  4·
  2 years ago
  Sweet, so us instance owners are literally fighting off a CSAM attack here on lemmy, and a lot of instances have shut down because of it. So, you’re stepping up to host now and will deal with the feds and the laws that are broken? Great, thanks.
  - newIdentity@sh.itjust.works
    link
    fedilink
    arrow-up
    4·
    2 years ago
    Depends on the country you live in but in most countries there are laws to protect you against abuse. Personally I wouldn’t host anything for anyone except myself because then I would need to give everyone my full name and home address, because that’s the law in Germany.
    
    There still is the moral aspect of it though.
  - yokonzo@lemmy.world
    link
    fedilink
    arrow-up
    3
    arrow-down
    10·
    edit-2
    2 years ago
    Oh screw off with your reactionist attitude, the poster asked how people felt about an action and I answered, no need to get all butthurt
    - Scrubbles@poptalk.scrubbles.tech
      link
      fedilink
      English
      arrow-up
      5
      arrow-down
      2·
      2 years ago
      Right, and I said that if you’re so against it then you should set up and run your own instance. I don’t think that’s very reactionist personally; I’m telling you to go do it yourself if you believe so much in it. Until then I don’t really care about how you feel about deserving free speech or censorship when you’re not willing to host it yourself. (There are tools that have one click deploy online, all they need is a credit card, feel free to get started today, tell everyone you allow all of the content)
      - yokonzo@lemmy.world
        link
        fedilink
        English
        arrow-up
        1
        arrow-down
        7·
        2 years ago
        See that last line right there, that’s the kind of snarky shit I’m talking about, go touch grass bud
        
        Magic Blue Smoke@frogdrool.net
        link
        fedilink
        arrow-up
        5·
        2 years ago
        @yokonzo @scrubbles
        
        I wouldn’t lose any sleep over this one, Scrubbles.
        
        it’s the same attitude the cat has for being fed on time. he has no concept of what is required for him to have the thing he demands, he just demands it loudly.
        
        Scrubbles@poptalk.scrubbles.tech
        link
        fedilink
        English
        arrow-up
        4·
        edit-2
        2 years ago
        lol nope, I think this one is hilarious. Wants unlimited free speech and to never be censored but will do nothing themselves to support it, just demands it of other people. The only way to be truly free is to host yourself, and as we’re seeing people will go out of their way to still try to ruin that too.
        
        Thanks Magic, and saw your profile is Seattle, hello fellow Seattle-ite :)
- jmcs@discuss.tchncs.de
  link
  fedilink
  arrow-up
  21
  arrow-down
  3·
  2 years ago
  Cool, I assume you are volunteering to go through all the pornographic content and make sure it doesn’t contain minors, and that it only involves people that consented to it, and take all legal responsibility for hosting and serving any illegal content? Great, I’ll contact all the Lemmy admins for you.
  - toothbrush@lemmy.blahaj.zone
    link
    fedilink
    arrow-up
    5
    arrow-down
    8·
    edit-2
    2 years ago
    IANAL, but as far as I know you don’t have to proactively remove illegal content, just the stuff you were made aware of.
    
    So all this drama about federating illegal content is very much overblown.
    
    Edit: sorry about calling it “drama”, I didn’t know the full extent of what’s currently happening (malicious users spamming CP).
    - Scrubbles@poptalk.scrubbles.tech
      link
      fedilink
      English
      arrow-up
      10
      arrow-down
      2·
      edit-2
      2 years ago
      As an instance owner who has had to become very knowledgeable about this in the last few hours because of bad actors on Lemmy, I can say this is absolutely not true.
      - toothbrush@lemmy.blahaj.zone
        link
        fedilink
        arrow-up
        1
        arrow-down
        2·
        2 years ago
        I’m curious: what are the legal duties of a fediverse hoster regarding illegal content currently? Do you really have to remove illegal content proactively? Because as far as I know, that’s just in the EU and only if you are one of the major digital services (which fediverse server hosters aren’t).
    - 𝒍𝒆𝒎𝒂𝒏𝒏@lemmy.one
      link
      fedilink
      arrow-up
      4·
      2 years ago
      I’m pretty sure the mods and admins of lemmyshitpost are fully aware of the illegal content being reported to them in one of the most popular Lemmy tiddeRverse communities, so i’m not entirely confident the ‘proactive removal’ info is relevant in this situation.
      
      If 10 volunteers can’t keep up with it, most of which have now quit, I find it really hard to see this as “drama” personally. I see it as a serious issue which has real life consequences for both the instance owner (risk of being raided) and the moderators subjected to reviewing it.
      
      I suspect you wouldn’t describe it as overblown if you were in the same situation as the mods. I occasionally sift through the modlog and there are occasionally some seriously vile takes in there, spam posts and abuse removed by these volunteers on a daily basis, all to keep our feeds clean. Add traumatic content on top of that too, and it’s no surprise some mods have left and they’ve shuttered the comm.
      
      Apologies if I come off as abrasive in this comment in general, but I just vehemently disagree with the take that this is just some “overblown drama”
      - toothbrush@lemmy.blahaj.zone
        link
        fedilink
        arrow-up
        3·
        2 years ago
        Ah sorry, I didn’t know that there is an attack going on currently. I just saw a bunch of posts about Lemmy being illegal to operate because of the risk of CP federation, and then this post, which seemed to imply that one needs constant automated illegal content filtering. As far as I know that isn’t required by law unless you operate a major service that is reachable in the EU, and fediverse servers aren’t major enough for that.
        
        Scrubbles@poptalk.scrubbles.tech
        link
        fedilink
        English
        arrow-up
        6·
        2 years ago
        Yeah on top of that it sounds like the people who did see it are pretty shaken up, apparently it was real fucked up. So not only blocking it from ever hitting the servers for legal reasons, but on top of that just so no one needs to see that. There are third party tools that will analyze it and block it automatically, and we’re hoping to get those online quick
    - Gamey@feddit.rocks
      link
      fedilink
      arrow-up
      4
      arrow-down
      1·
      2 years ago
      Yea, just leave the CSAM till someone reports it, great solution!
      - toothbrush@lemmy.blahaj.zone
        link
        fedilink
        arrow-up
        2
        arrow-down
        2·
        2 years ago
        Well, that’s how it generally worked as far as I know. I’m not saying that you can host illegal stuff as long as no one reports it. I’m saying it’s impossible to know instantly if someone’s posting something illegal to your server; you’d have to see it first. Or else pretty much the entire internet would be illegal, because any user can upload something illegal at any time, and you’d instantly be guilty of hosting illegal content? I doubt it.
- m-p{3}@lemmy.ca
  link
  fedilink
  arrow-up
  1·
  2 years ago
  Censoring != tagging
- Daniel Quinn@lemmy.ca
  link
  fedilink
  arrow-up
  1·
  2 years ago
  I hear you, and I’d love to live in a world where people could be trusted not to be assholes when posting things on social media, but sadly we don’t live in that world. In this world, people post some Truly Evil Shit, and this presents a problem when a platform scales.
