“Not a big danger”: how the French at Wikipedia are handling the arrival of ChatGPT

While the community encyclopedia tracks down prohibited uses of ChatGPT, artificial intelligence could help improve Wikipedia, Capucine-Marin Dubroca Voisin tells Tech&Co.

Will generative artificial intelligences eventually write their own Wikipedia pages? Internet users have already spotted “wild” uses of ChatGPT to generate series of articles on the participatory encyclopedia. Interviewed by Tech&Co, the president of the Wikimedia France association says that, for the moment, such articles are easy to detect.

“ChatGPT is a tool that helps vandals, but used crudely, it’s not a big danger for us,” says Capucine-Marin Dubroca Voisin.

Hallucinations, a persistent problem

The president of the French “chapter” recognized by Wikipedia says she is open to using AI to contribute to the site, but rules out ChatGPT. “In fact, ChatGPT is very good at simulating human language, although it’s still not perfect, especially since it was trained on Wikipedia, and we have a specific kind of language, a bit cold, that claims to be neutral,” she explains. “But it can’t do what we need, which is to reliably cite sources. So it could be a generative AI that helps us do that, but not necessarily one in the style of ChatGPT.”

Above all, she points out that the chatbot developed by OpenAI still tends to “invent information”.

Diversify the content

The founder of the participatory encyclopedia, Jimmy Wales, said last month that Wikipedia could use AI for various purposes, such as detecting duplicate and contradictory statements, or identifying blind spots and biases in news coverage.

The use of AI in this context is not new, notes the head of Wikimedia France: “A few years ago, a company created an AI to generate biographies of women, in particular for the English version of Wikipedia. We also have contributors who write code to generate biographies, such as Roland45, who created a script to generate a thousand biographies of women scientists semi-automatically, which we then reviewed manually.”

More recently, the Sans PagEs project has set out to reduce gender bias on French-language Wikipedia, where “between 90 and 100%” of the people cited in articles on important topics such as philosophy, science or history are men, and where more than 80% of biographies are of men. This imbalance mirrors the profile of the majority of the volunteers who contribute to the collaborative encyclopedia.

The Wikipedian also mentions the possibility of using AI to “compare the different language versions of Wikipedia” in order to add data that is missing from some of them. Dubroca Voisin notes that tools already exist to identify these variations, such as Wikidata, the database that organizes all of Wikipedia’s data.

Attribution and share-alike

As we have seen, generative AIs make massive use of Wikipedia content to generate their responses. This use is authorized because the articles in the online encyclopedia are covered by a Creative Commons license (BY-SA), which authorizes the free reproduction, distribution and modification of the content, explains Capucine-Marin Dubroca Voisin. But this license applies under two conditions:

1- attribution, that is, crediting the author and the source (which ChatGPT currently does not do);

2- sharing under the same conditions. This last point is subject to debate, since the question of who authors an image or a text produced by an AI has not been settled.

More generally, the concept of “fair use”, which exists in the United States, means that all elements of the web, including those protected by copyright, could in theory be collected to train an AI, as noted in an article published by Numerama.

Evolving information

But generative AI could also strike at the heart of the Wikipedia model, whose reliability and relevance depend on an army of volunteer contributors constantly editing and updating its pages. If information can be obtained through a simple question to a chatbot, “without traceability, without the possibility of modifying it”, “we run the risk of top-down information, which is no longer managed by a community”, argues the president of Wikimedia France.

Author: Lucia Lequier
Source: BFM TV