Combating cultural bias in AI model translation

While AI bias is often discussed in terms of the systemic bias large language models sometimes exhibit against certain genders and races, it is becoming clear that models can also be biased by prioritizing one language over another.

In recent years, efforts have been made to curb this preference, with AI model developers such as Google and OpenAI building translation models. Most recently, Google released TranslateGemma on January 15, trained on 55 languages and 500 language pairs – pairs of languages the model can translate between.

However, translation models fail to capture some of the nuances of spoken language. Enterprise AI platform vendor Articul8 says its LLM-IQ agent offers deeper insight here. The multi-tiered evaluation agent system scores models on five qualitative dimensions: flow and naturalness, coherence, cultural norms, consistency, and clarity.
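
Articul8 has not published LLM-IQ's internals, so the sketch below is purely a hypothetical illustration of rubric-style scoring across the five named dimensions – the class names, the 1–5 scale, and the unweighted mean are all assumptions, not Articul8's implementation:

```python
from dataclasses import dataclass

# The five qualitative dimensions LLM-IQ is reported to score.
DIMENSIONS = (
    "flow_and_naturalness",
    "coherence",
    "cultural_norms",
    "consistency",
    "clarity",
)

@dataclass
class TranslationScore:
    """Per-dimension scores on an assumed 1-5 scale."""
    scores: dict

    def aggregate(self) -> float:
        # Simple unweighted mean; a real evaluator might weight
        # cultural_norms more heavily for market-readiness checks.
        return sum(self.scores[d] for d in DIMENSIONS) / len(DIMENSIONS)

def judge(translation: str, rubric_scores: dict) -> TranslationScore:
    # In an agentic evaluator, rubric_scores would come from an LLM
    # judge prompted with the rubric; here they are supplied directly.
    missing = set(DIMENSIONS) - set(rubric_scores)
    if missing:
        raise ValueError(f"missing dimensions: {missing}")
    return TranslationScore(rubric_scores)

# A fluent but culturally off translation scores well on clarity yet
# poorly on cultural_norms, dragging down the aggregate.
result = judge("...", {
    "flow_and_naturalness": 5,
    "coherence": 5,
    "cultural_norms": 1,
    "consistency": 4,
    "clarity": 5,
})
print(round(result.aggregate(), 2))  # 4.0
```

The point of keeping dimensions separate rather than emitting one score is exactly the failure mode described next: a model can pass on fluency while failing on cultural appropriateness.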

Alongside the framework, Articul8 found that many models fail on cultural appropriateness, suggesting that more work is needed to make AI technology ready on a global scale.

In this Q&A, Articul8 CEO and founder Arun Subramaniam discusses what led to the development of the framework and why culturally appropriate models matter.

What inspired Articul8 to develop the LLM-IQ agent, and why did it focus on translation nuances in AI models?

Arun Subramaniam: We have customers in Japan and Korea. As we started deploying in those regions, we needed models that truly understood multiple languages.

One thing that happened was that when we deployed some of our systems early on, customers were both happy and unhappy.

In Japan and Korea, they told us, “Your answer is accurate, but it’s rude.”

We said, ‘Okay,’ but we didn’t know the difference.

It turns out Japanese has many layers of complexity, as do many languages. In English, ‘you’ is just ‘you’ – neither respectful nor disrespectful. Many languages have one ‘you’ for people on your own level and a different word for addressing elders, superiors, or someone deserving of respect. Models sometimes capture those nuances, but most of the time they don’t.

But in Japanese there’s another level, where the context of what you’re saying matters: who you’re saying it to, who is speaking, and what outcome you want from the conversation. You can be direct, indirect, polite, extremely polite, or a little harsh. Depending on the context, using the wrong register is itself considered wrong.

That’s really what drew us in, because it operates at a linguistic level. It isn’t a technical domain, but it is still a domain-specific language problem for Japanese.

When we did more research, we found the problem was systematic. Models were built mainly on English and other Latin-script languages, and even the models from China completely missed this nuance. Japanese may be well represented in digital content, but the models were not trained to capture these nuances.

In what situations would it matter whether the LLM is polite or rude?

Subramaniam: For example, in a supply chain, you never know whether someone was making a recommendation or giving an instruction – a distinction that can have a deep impact.

It can also carry serious costs.

Say you have an automotive system generating recommendations, and a human in the loop reading them. The human does not know whether a recommendation needs to be acted on with 100% certainty. That has a deep impact in an industrial environment.

With the rise of sovereign AI, more regional AI vendors are addressing local issues with their own technology. Why should a vendor from outside a country like Japan choose to tackle its language problem?

Subramaniam: I look at it as someone with global insight versus someone with only local insight. You need to be locally competent but globally optimized.

It’s about global expertise implemented quickly in Japan, with localization that’s uniquely Japanese. It’s very different because, yes, a local vendor knows more about localization right away, but imagine having to work globally with all the data to do what you need to do.

For example, our energy models are built on global datasets, and our manufacturing models on global partnerships. Our research partnership with Meta and our scaling partnership with AWS all come about because we are a global operator. But we also operate with a deep understanding that, even though we are global, we have to adapt what we do.

Why do you think LLMs seem unable to understand the nuances of a language like Japanese?

Subramaniam: The biggest drawback is that the datasets are extremely biased. By ‘biased’ I mean an asymmetric distribution of English versus non-English content. Even among Latin-script languages, the distribution is asymmetric – I’m talking 99% to 1%. That is not a minor difference.

Even digitized non-English content comes mainly from the West or from sources to which we do not have access, such as China.

All of that politeness – what is considered polite or rude, what is considered natural human interaction – has come from the West.

In developing this framework, did you find that open source models work better than proprietary ones?

Subramaniam: We benchmarked against all the open source models and all the closed source models. But then we had to build these models from the ground up, because we had to balance the dataset. If you don’t balance the dataset, you will keep suffering from the same bias.

We have a concept called model mesh, which lets us orchestrate and decide at runtime which models to call. We don’t necessarily need one big general-purpose model fine-tuned for every task. We can have independent task-specific models and let them work together as a system, with the system acting as a runtime reasoning engine.
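
Articul8 has not published how model mesh works internally, but the runtime-routing idea described here can be sketched as a registry of task-specific models with a general-purpose fallback. All names, tasks, and the routing rule below are illustrative assumptions, not Articul8's design:

```python
from typing import Callable

# Registry mapping task names to model callables. In a real system
# these would be hosted models; here they are stand-in functions.
registry: dict[str, Callable[[str], str]] = {}

def register(task: str):
    """Decorator that registers a model under a task name."""
    def wrap(fn):
        registry[task] = fn
        return fn
    return wrap

@register("ja_politeness")
def japanese_politeness_model(text: str) -> str:
    # Stand-in for a specialist tuned for Japanese register and keigo.
    return f"[ja-specialist] {text}"

@register("general")
def general_model(text: str) -> str:
    # Stand-in for a large general-purpose model.
    return f"[general] {text}"

def route(task: str, text: str) -> str:
    """At runtime, pick a specialist if one exists, else fall back."""
    model = registry.get(task, registry["general"])
    return model(text)

print(route("ja_politeness", "こんにちは"))  # handled by the specialist
print(route("summarize", "long report"))     # falls back to general
```

The design choice this illustrates is the one Subramaniam describes: specialists stay small and independent, and the router – not a single monolithic model – carries the runtime decision of which one to invoke.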

Yes, we use general-purpose models to get information about the world. But when it comes to Japan and the Japanese language, we have our own model.

The second question people will have is, ‘Oh my God, do I need to make massive models for every single task?’

The answer is no, because we end up with a family of models that grow together. If one model does a task really, really well, it improves the others across the board.

Editor’s note: This interview has been edited for clarity and brevity.
