The ViGen project develops Vietnamese open-source datasets for use in training and evaluating the capabilities of AI models serving Vietnam.
At the announcement ceremony of the Innovation Challenge 2025, a representative of the National Innovation Center (NIC) said that the program aims to promote the development of artificial intelligence in Vietnam. At the same event, NIC announced the ViGen project to build large-scale, high-quality Vietnamese datasets.
Datasets are an important foundation for building large language models (LLMs) before they are put into applications. The higher the quality of the dataset, the more accurate the LLM will be.
According to Mr. Tran Viet Hung, co-founder of AI for Vietnam - the unit implementing ViGen, Vietnamese is used by more than 100 million people, but large language models are currently trained on less than 1% of Vietnamese data. "That is why the output of current AI models has informational value but is not natural and does not convey the full value of Vietnamese, so it is not very useful or effective," Mr. Hung said.
The project representative said that ViGen will build a large-scale, high-quality open-source Vietnamese dataset to train and evaluate the capabilities of AI models. This is intended to help ensure that AI development in Vietnam is consistent with Vietnamese cultural values and ethical standards, with the aim of building an open-source AI ecosystem that is locally appropriate and responsible.
The project roadmap spans three years, until 2027. In 2025, work will focus on building and developing the dataset, before moving toward completion and putting it into application.
ViGen is the result of a three-way collaboration between Meta, NIC and AI for Vietnam. NIC manages and coordinates the project and ensures that it is consistent with Vietnam's national goals. AI for Vietnam is the implementing partner. Meta provides technical and financial support. In addition, Meta said it will contribute open-source datasets from its AI and Data for Public Interest program, including information on mobility and social connectivity, as well as training data from AI-powered population maps. The project's strategic partners include Nvidia, Viettel and the Vietnam Academy of Science and Technology.
According to Mr. Hung, given the current speed of AI development, the opportunity will be missed if Vietnam does not move quickly to take advantage of it. An open-source Vietnamese dataset saves individual projects from spending significant time and resources on training and investment. "ViGen's mission is to make AI models support Vietnamese naturally and comprehensively from the core, thereby 'unlocking the potential of artificial intelligence applications in Vietnam'," he said.
Professor Yann LeCun, Chief AI Scientist at Meta, described the project as "not only aiming to promote technology, but also aiming to build a comprehensive AI future, honoring and integrating Vietnam's unique cultural and linguistic heritage".
At the announcement ceremony, Mr. Vo Xuan Hoai, Deputy Director of NIC, emphasized that AI is transforming the world, so developing large-scale, high-quality, open-source Vietnamese datasets for AI training and evaluation has become an urgent priority.
According to him, ViGen is in line with Resolution 57 of the Politburo in promoting breakthroughs in science, technology, innovation and national digital transformation, but requires joint efforts from policymakers, research groups, researchers, developers, experts and users.
"The participating units will turn AI into a powerful tool for all Vietnamese people and make Vietnam a global AI powerhouse," said Mr. Hoai.
This is the third year the Vietnam Innovation Challenge has been organized. Since 2022, the program has attracted more than 750 solutions from 20 countries and territories each year. Deputy Prime Minister Nguyen Chi Dung said that this is a strategic program to seek innovative solutions worldwide to solve important national challenges, towards a prosperous and sustainable Vietnam.
"For the program to succeed, the public and private sectors and domestic and foreign partners must join hands in forming, testing and implementing innovative initiatives for a prosperous Vietnam," said Mr. Dung.
VN (according to VnExpress)