Outside of English, ChatGPT makes up words, fails logic tests, and can’t do basic information retrieval.

  • GenderNeutralBro@lemmy.sdf.org
    1 year ago

    ChatGPT is not only made for information gathering, though

    I’d argue that it is not made for information gathering at all, and it is largely coincidental that it performs as well as it does even in English.

    • Kazumara@feddit.de
      1 year ago

      Our CIO at work posted a warning about using ChatGPT on sensitive data. The shocking part was that, in the set of examples of why we might already be using ChatGPT, he mentioned “for performing a quick fact check”, which is insane to me. Who would use a system that is known to just generate likely answers, even if they are untrue, for a fact check of all things?!

    • Dudewitbow@lemmy.ml
      1 year ago

      Machine learning is only as good as the dataset it has, and given that English has a HUGE dataset on the internet, it’s okay at it, but it makes sense that for other languages it’s likely not ideal.

      An example would be art. Compare a model using a smaller dataset (e.g. fully legal ones where all training data had artist permission) with ones trained on the larger dataset where legality wasn’t a concern. Night and day difference.

      • FaceDeer@kbin.social
        1 year ago

        ChatGPT is actually able to translate the information it learns in one language into other languages, so if it’s having trouble speaking Bengali and such, it must simply not know the language very well. I recall a study where an LLM was trained on some new information using English training data and was then asked about it in French, and it was able to talk about what it had learned in French.

          • GenderNeutralBro@lemmy.sdf.org
            1 year ago

            That’s an important point you raise. I feel like a big problem with the LLM projects we see today, including ChatGPT, Bard, etc., is that the developers have tunnel vision. Rather than using the LLM as one component of a system with many well-researched traditional algorithms doing what they do best, they want to do everything within the network.

            This makes sense from a research perspective. It doesn’t make sense from an end-product perspective.

            The more I play with LLMs, the more I feel like their true value is as something like “regular expressions on crack”.

        • Dudewitbow@lemmy.ml
          1 year ago

          Of course. But with translations come mistranslations, especially for languages that are tier 3+ in difficulty for an English speaker to learn. The data is subject to the accuracy of the translation, and ChatGPT translation is still pretty far from perfect.

          • FaceDeer@kbin.social
            1 year ago

            Ah, I had interpreted your comment to mean that you thought ChatGPT wouldn’t know how to answer a question in Bengali unless the information it needed to solve the problem had been part of its Bengali training set. My bad.

    • ecamitor@beehaw.org
      1 year ago

      Yeah it feels amazing.

      So much new tech only works in English (for example, image generation like StableDiffusion), so using ChatGPT in your native language and having it work great, almost as well as in English (and actually even better for lots of people, because they can express themselves better in their native language than in English), felt unreal.