Uncensored AI Models: The Importance of Composable Alignment for Cultural Diversity and Research Freedom

Andrea Belvedere
3 min read · Jun 19, 2024


An AI model is a machine learning program trained to perform specific tasks, such as answering questions and interacting with users. ChatGPT is a popular example: it is trained on large amounts of text to understand natural language and generate relevant responses. However, the censorship and alignment of such models have provoked significant debate in the field of artificial intelligence.

Many AI models, including Alpaca, Vicuna, WizardLM, and others, are designed with built-in alignment. This alignment is meant to keep the model from giving dangerous or inappropriate responses, protecting users from harmful information.

“In reinforcement learning from human feedback, it is common to optimize against a reward model trained to predict human preferences. Because the reward model is an imperfect proxy, optimizing its value too much can hinder ground truth performance, in accordance with Goodhart’s law. This effect has been frequently observed, but not carefully measured due to the expense of collecting human preference data.”

From “Scaling Laws for Reward Model Overoptimization” (arXiv:2210.10760)
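To make the effect in the quoted passage concrete, here is a minimal toy sketch in Python. It is not the paper's experimental setup. The two reward functions are made-up assumptions: `proxy_reward` stands in for an imperfect learned reward model, and `gold_reward` stands in for the ground truth, which agrees with the proxy for moderate values but penalizes extremes. Picking the best of `n` random candidates by proxy score is used here as a simple way to optimize harder as `n` grows.

```python
import random
import statistics


def proxy_reward(x: float) -> float:
    # Hypothetical learned reward model: it believes more x is always better.
    return x


def gold_reward(x: float) -> float:
    # Hypothetical ground truth: rises with x at first, then penalizes extremes.
    return x - 0.5 * x * x


def best_of_n(n: int, rng: random.Random) -> tuple:
    # Draw n candidate "responses" and keep the one the proxy likes most.
    candidates = [rng.gauss(0.0, 1.0) for _ in range(n)]
    best = max(candidates, key=proxy_reward)
    return proxy_reward(best), gold_reward(best)


def sweep(ns, trials: int = 2000, seed: int = 0) -> dict:
    # Average proxy and gold reward of the selected candidate for each n.
    rng = random.Random(seed)
    results = {}
    for n in ns:
        scores = [best_of_n(n, rng) for _ in range(trials)]
        mean_proxy = statistics.fmean([p for p, _ in scores])
        mean_gold = statistics.fmean([g for _, g in scores])
        results[n] = (mean_proxy, mean_gold)
    return results


if __name__ == "__main__":
    for n, (proxy, gold) in sweep([1, 4, 16, 64, 256]).items():
        print(f"n={n:4d}  proxy={proxy:+.2f}  gold={gold:+.2f}")
```

In this toy setup the proxy score keeps climbing as `n` grows, while the gold score improves for small `n` and then falls. That is the Goodhart pattern the quote describes: pushing too hard on an imperfect measure eventually hurts the thing it was meant to measure.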

Limits of alignment and the need for uncensored models

Despite the benefits of alignment, there are valid reasons to create uncensored models. Global cultural diversity requires that AI models be able to reflect a wide range of values and norms. For example, different political and religious groups might want models whose responses follow their own principles more closely. Alignment can also limit the use of AI in creative or academic contexts, such as writing fiction with complex characters or conducting pure research on controversial topics.

https://huggingface.co/TheBloke/WizardLM-7B-uncensored-GGML

Uncensored or unaligned models are sometimes reported to perform better on certain tasks than aligned models like GPT-4 and PaLM. WizardLM-7B-Uncensored is often cited as an illustration of why uncensored models matter for scientific exploration, freedom of expression, composability, storytelling, and even humor.

American culture is not the only one that exists, and different cultures might want models that reflect their own values. Writing fiction, which can require extreme behavior for plot development, can be hindered by overly censored models. Academic research into how certain things work, or plain intellectual curiosity about them, is not the same as intent to commit illegal acts, even when the subject is dangerous. Users should have full control over the models running on their own devices, without restrictions imposed by third parties.

Composable alignment: a balanced approach

Composable alignment means starting from an unaligned base model and then building specific alignments on top of it, according to the needs of particular users or interest groups. This approach offers the flexibility to adapt models to different contexts and requirements while keeping the use of AI safe and responsible.
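The idea of layering alignments on a shared base can be sketched in a few lines of Python. This is only an illustration under stated assumptions. In practice an alignment would more likely be a fine-tune or adapter trained on top of the base model. Here plain wrapper functions stand in for that, and every name (`base_model`, `refuse_topics`, `add_preamble`, `compose`) is hypothetical rather than part of any real library.

```python
from typing import Callable, Iterable, List

# A "model" is modeled here as any function from prompt text to response text.
Generate = Callable[[str], str]
# An alignment layer takes a model and returns a new, more constrained model.
AlignmentLayer = Callable[[Generate], Generate]


def base_model(prompt: str) -> str:
    # Stand-in for an unaligned base model (hypothetical; no real model is called).
    return f"[base answer to: {prompt}]"


def refuse_topics(blocked: Iterable[str]) -> AlignmentLayer:
    # Layer that refuses prompts mentioning any blocked topic.
    blocked_lower = [topic.lower() for topic in blocked]

    def layer(generate: Generate) -> Generate:
        def aligned(prompt: str) -> str:
            if any(topic in prompt.lower() for topic in blocked_lower):
                return "I can't help with that topic."
            return generate(prompt)
        return aligned
    return layer


def add_preamble(preamble: str) -> AlignmentLayer:
    # Layer that steers the model by prepending group-specific instructions.
    def layer(generate: Generate) -> Generate:
        def aligned(prompt: str) -> str:
            return generate(f"{preamble}\n\n{prompt}")
        return aligned
    return layer


def compose(base: Generate, layers: List[AlignmentLayer]) -> Generate:
    # Apply layers in order; the last layer in the list ends up outermost.
    model = base
    for layer in layers:
        model = layer(model)
    return model


# Two different audiences share one base model but get different alignments.
fiction_model = compose(base_model, [add_preamble("You are a novelist.")])
classroom_model = compose(base_model, [refuse_topics(["weapons"])])
```

The design point is that the base stays unchanged and each group chooses its own layers, so `fiction_model` and `classroom_model` can behave differently without either one imposing its constraints on the other.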

https://arxiv.org/pdf/2210.10760

Composable alignment yields a flexible base model that can be adapted to various needs and contexts. It gives users greater control over the responses AI models provide, promotes cultural diversity and freedom of expression, and fosters responsible and safe use of artificial intelligence.

While alignment of AI models is essential to ensure safe and responsible interactions, it is equally important to consider the need for uncensored models. These models can better respond to the diverse cultural, political, and creative needs of global users. Composable alignment represents an innovative approach that balances safety and freedom, promoting broader and more responsible use of artificial intelligence. Collaboration within the open-source AI community is crucial to creating models that respect both safety and freedom of expression, while ensuring the advancement of knowledge and innovation.

References

https://www.economymagazine.it/adesso-i-guru-del-digitale-difendono-dalla-censura-lai/

https://arxiv.org/pdf/2210.10760

https://huggingface.co/TheBloke/WizardLM-7B-uncensored-GGML
