Why Is Chinese AI Generally Open Source?

hongmeng327 (32) in #chinaopen • 6 months ago

In 752 A.D., in a papermaking workshop in Samarkand, an old man named Du has grown used to being a prisoner of war. Only occasionally, while fetching water to soak tree bark, does he think of a morning a year earlier, when he washed his armor on the banks of the Talas River and the water was just as clear and cold.

On that day, in the valley of the Talas River, 500 miles from Suyab, the hometown of the great poet Li Bai, the armor of 30,000 Tang soldiers gleamed a cold blue-black under the scorching sun. Facing them were 100,000 white-robed troops of the Arab coalition. The first arrow split the sky at three minutes before the hour, and the storm of steel loosed by three hundred Tang Fuyuan crossbows blanketed the front line of the charging Arab cavalry. Then the modao (long-saber) formation of Anxi veterans advanced in step and cut down the surviving cavalry, horses and riders alike, in a cascade of blood and flesh. It seemed that even the Arab Empire, strong as the midday sun, was no match for the mighty Tang.

But the Karluks, Tang allies posted on the flank, suddenly mutinied and tore through the right side of the line like a desert viper. The scarlet banner reading “Anxi Military Governor Gao” was cut down, the Tang army was taken by surprise and attacked from front and rear, and the whole formation was torn apart by the 100,000 Arabs. Even with 700 modao swordsmen fighting to the death, only a few thousand Tang troops made it back to Anxi; the rest were killed or captured.

In the long scroll of history, the evolution of civilization is like a treacherous voyage, and the decisive turns often come not from careful planning but from countless accidents. The Battle of Talas was a disaster for the Tang. After it, the Tang conquest of Central Asia came to an abrupt end. When the An Lushan Rebellion broke out soon afterward, the Anxi army was ordered back to the Central Plains to suppress it, and the Tang lost Central Asia for good. In the wider history of human development, however, the Battle of Talas advanced civilization along another dimension. As is well known, the transition from feudal to industrial society began with the Renaissance in the 15th century, followed by the technological explosion and Industrial Revolution in Europe.

But has anyone ever wondered why Europe, after groping in the dark for 1,000 years through the Middle Ages, suddenly had a Renaissance in the 15th century? The reason is simple: the cost of disseminating knowledge fell. For 1,000 years the vehicle of knowledge in Europe had been the parchment book. Parchment books were prestigious, but they were a luxury. First, the material was expensive: one sheepskin yields at most 4 to 6 pages of parchment, so how many sheep does one book take? Second, the labor was expensive: a scribe needed a month to copy a single book. In Florence in 1282, a hand-copied parchment Bible was worth 60 florins (a florin being a gold coin containing 3.5 grams of gold). At that rate, one Bible would cost about 150,000 RMB today. Books were meant to be a principal means of spreading knowledge, but their price made them so rare that the general public had no access to them at all. Knowledge, which had carried the responsibility of passing on human civilization since its birth, was thus firmly monopolized by the church and the aristocracy.

Two inventions from China changed Europe. A large number of Tang soldiers were captured at the Battle of Talas, and some of them knew how to make paper. When the Arabs found the paper the captives carried, it was as if they had found treasure. From then on, Tang papermaking began to spread through the Arab Empire. (Historians hold differing views on how papermaking spread westward, but most still credit the Battle of Talas, and the murals at the site of the papermaking workshop in Samarkand are cited as proof.) Those Tang soldiers could not have imagined the shockwaves that would follow when the papermaking in their hands reached Europe and cut the price of paper to one percent of that of parchment.
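The price quoted for the hand-copied Bible can be sanity-checked with a little arithmetic. The sketch below takes the 60 florins and the 3.5 grams of gold per florin from the text; the gold price of 700 RMB per gram is an assumed round figure that does not appear in the article, and the result moves with it.

```python
# Rough check of the hand-copied Bible's price in modern money.
FLORINS_PER_BIBLE = 60          # from the text
GRAMS_GOLD_PER_FLORIN = 3.5     # from the text
ASSUMED_RMB_PER_GRAM = 700      # assumption: a round modern gold price, not from the text


def bible_gold_grams():
    """Total gold content of the purchase price, in grams."""
    return FLORINS_PER_BIBLE * GRAMS_GOLD_PER_FLORIN


def bible_price_rmb(rmb_per_gram=ASSUMED_RMB_PER_GRAM):
    """Price of one Bible in RMB at the given (assumed) gold price."""
    return bible_gold_grams() * rmb_per_gram


print(bible_gold_grams())   # 210.0 grams of gold
print(bible_price_rmb())    # 147000.0 RMB, close to the article's 150,000
```

At the assumed gold price the figure lands near the 150,000 RMB the article cites; a different gold price scales the result proportionally.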

By 1175, the annual output of the Fabriano paper mill in Italy had reached 3 million sheets. Cheap paper greatly reduced the cost of acquiring knowledge, but copying remained frighteningly slow: a book could still take months to copy by hand, and that was the bottleneck. Then came movable type from China. In 1455 the German Johannes Gutenberg, building on Chinese movable-type printing, developed his own metal movable-type press, which could print 3,000 pages a day, as much as a friar could copy in a year. Although the first book he printed was the Bible, from the moment those neat rows of type appeared on paper, nothing could stop the low-cost spread of knowledge.

The effect of papermaking and printing on civilization was revolutionary. As the Church's system of scribes collapsed, the cost of acquiring knowledge fell a hundredfold, and paper and the press gave ordinary people the right to read. Dante's Divine Comedy no longer belonged to the Florentine aristocracy; it was the talk of tavern poets. The “divine prerogative” of Latin dissolved into a carnival of German, French, and English. The bishops, perched atop their ivory towers, could not believe to the last that they had been brought down by pamphlets passed around in taverns. The fires of the Inquisition burned as fiercely as they could, but they could not outrun the paper copies of the Decameron. In time, printed copies of Martin Luther's Ninety-five Theses and Galileo's heliocentric writings spread unchecked across the continent and tore apart the cognitive shackles of feudal society. After all, once a luxury becomes a daily necessity, monopolists can no longer hold back the flood of ideas.
And when the cost of reproducing knowledge falls below a critical value, the number of people learning can set off a “chain reaction” that brings a technological explosion. The reason is simple. When knowledge is in the hands of 100 people, a breakthrough is unlikely even if they devote their whole lives to it. When it is in the hands of tens of millions, even if the vast majority are mediocre, 1,000 geniuses among them will produce enough flashes of brilliance to move civilization forward. Leaf through the history of mathematics, physics, or chemistry and you will find that technological progress has often been driven by a handful of people from among the general public. Cai Lun and Bi Sheng could not have imagined that, after a military defeat, their inventions of papermaking and printing would drift like dandelion seeds and make civilization blossom on a continent 10,000 miles away.
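The “chain reaction” argument is, at bottom, a statement about expected counts. As a back-of-the-envelope sketch, assume purely for illustration that the article's ratio of 1,000 geniuses among 10 million readers (one in 10,000) applies at any scale and that each reader is an independent draw; both the rate and the independence are assumptions made here, not claims from the text.

```python
# Illustrative only: the rate is derived from the article's "1,000 geniuses among
# tens of millions", taking "tens of millions" as 10 million (an assumption).
GENIUS_RATE = 1_000 / 10_000_000  # one in 10,000


def expected_geniuses(readers):
    """Expected number of geniuses among `readers` people."""
    return readers * GENIUS_RATE


def prob_at_least_one(readers):
    """Chance that at least one genius is among `readers` independent people."""
    return 1 - (1 - GENIUS_RATE) ** readers


for readers in (100, 10_000_000):
    print(readers, expected_geniuses(readers), round(prob_at_least_one(readers), 4))
```

Under these assumptions a circle of 100 readers has roughly a 1% chance of containing even one such person, while a readership of 10 million contains about a thousand.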

More than 1,000 years later, in 2024, in an office building in Hangzhou's Gongshu District, engineers are debugging their latest large model. If they have anything in common with the Tang craftsmen, it is that they are doing the same thing: carrying the fire of knowledge through a class blockade. Their model, DeepSeek, is open source.

What is open source? In computing, it can be understood simply as making one's source code, design ideas, underlying architecture, and so on public. The idea dates back to the early days of computing, when programmers shared code within small circles to improve software. Everyone had a chance to take part in cutting-edge innovation and to be the key player who pushed the technology forward. Open source became a real social movement in the 1980s. Stallman, a programmer at MIT, found that his printer kept jamming. Suspecting the driver, he asked the manufacturer, Xerox, for its source code so he could study it. Xerox refused. This “technological monopoly” infuriated the freedom-minded programmer, and Stallman promptly launched the “GNU Project” with a declaration that shook the era: “Software should be as free as air!” The concept of “free software” was born. Later, under the stewardship of a group of Silicon Valley elites, the movement was rebranded as “open source” and the OSI (Open Source Initiative) was established. The field of AI has its own dispute between open source and closed source.
Take OpenAI, the company behind ChatGPT. Sam Altman, a co-founder of OpenAI, was once a typical open source advocate. In 2015 a reporter asked him which was safer: a small number of AI systems controlled by large companies, or a large number of independent systems? Altman chose the latter without hesitation. He went on to found OpenAI, promising to fight Google's monopoly and build open AI for the benefit of mankind, which is why the company has “Open” in its name. And what happened? OpenAI open-sourced a few not-so-successful large models and then went closed source with GPT-3, to say nothing of the later GPT-3.5 and o1. Why closed source? Altman offered a string of reasons: open source may bring security risks, and closed source keeps the model safe and controllable; outside competition is fierce, and closed source keeps OpenAI's technology from being easily copied; commercial applications earn the profits that attract more computing power and talent, forming a virtuous cycle; and so on. In fact, it is all bullshit. The teenager who once slew the dragon has himself become the evil dragon, thinking only of glittering gold coins.

OpenAI has by now established an initial business model, the token economy: it sells paid API access priced by token usage, earns large profits, and then uses its strong profit outlook to raise still more money from the capital markets. The problem is that a general-purpose large model cannot meet everyone's needs, and GPT is too expensive for people of modest means. Open-source large models emerged in response. The first was Meta's Llama. Llama was an open-source pioneer, but its openness was “false open source”: Meta released only the trained model, not the training code or training data, and explicitly required that it be used only for scientific research, not in products.

And DeepSeek? It has open-sourced not only models in multiple sizes (1.5B, 7B, 14B, 32B, 70B, and 671B) but also the related training data, code, and MoE architecture, and it provides a basic development toolkit. What does this mean? As long as your computer is good enough, you can deploy DeepSeek locally and use AI without spending a penny. You can also use “distillation” to keep just the capabilities you need from the DeepSeek model and build a specialized model of your own. Whether you want it to act as your doctor, write documents for you, or serve as a tool for AI development, you can do as you please. If you are skilled enough, you can even take the training data, code, and algorithms DeepSeek has published, rent a server, and rebuild a large model as your own cyber girlfriend to satisfy all your fantasies. By analogy, closed source is like paying to eat at a Michelin restaurant, while open source is like cooking at home, where the only cost is the ingredients.
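Whether a computer is “good enough” to run a model locally comes down mostly to memory. As a rough sketch, a model's weights occupy about (number of parameters) × (bytes per parameter). The estimate below ignores activations, the KV cache, and runtime overhead, and the 16-bit and 4-bit precisions are illustrative assumptions rather than anything DeepSeek specifies; the sizes are the smaller ones listed above.

```python
def weight_memory_gb(params_billion, bits_per_param):
    """Approximate memory for the weights alone, in decimal gigabytes."""
    total_bytes = params_billion * 1e9 * bits_per_param / 8
    return total_bytes / 1e9


# Smaller model sizes named in the text, in billions of parameters.
for size in (1.5, 7, 14, 32, 70):
    fp16 = weight_memory_gb(size, 16)  # assumed 16-bit weights
    q4 = weight_memory_gb(size, 4)     # assumed 4-bit quantized weights
    print(f"{size}B: about {fp16:.1f} GB at 16-bit, {q4:.1f} GB at 4-bit")
```

By this estimate a 7B model needs roughly 14 GB for 16-bit weights or 3.5 GB at 4-bit, which is why the small variants fit on consumer hardware while the largest ones do not.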
The trouble with closed source is that a Michelin restaurant's dishes are standardized and hard to adjust to your taste. Open source is like sharing the restaurant's recipes, with every step and detail spelled out; all you have to do is use less sugar or more salt to get a dish that suits your taste better at the same Michelin standard. Anyone can become a Michelin “chef”. India, for example, jumped on the bandwagon right after DeepSeek was open-sourced, saying it plans to develop a local large language model within the next 10 months with DeepSeek as a reference. The United Kingdom likewise said that drawing on DeepSeek could help it become the AI center of Europe. Seeing this, proud as we are, we may also feel a little doubtful: we worked hard to produce DeepSeek, so how can the United Kingdom and India use it for free? Aren't we just doing the work for others to reap the benefit? This question goes to the heart of the matter: why should China's AI take the open-source road? Simply put, for the future of human civilization.

Anyone who has played Cyberpunk 2077 knows that Night City is a place of corruption and darkness. The entire city is effectively controlled by a multinational corporation, the Arasaka Group, and cutting-edge technologies such as AGI and biotechnology have seeped into every aspect of social life. These technologies have given humanity unprecedented capabilities, but they have also become tools of exploitation and oppression. By controlling technological resources, Arasaka has further consolidated its power and status: the “upper class” lives atop the skyscrapers, rides in flying cars, and looks down on everyone else, while the lower class endures crime, violence, and a scramble for resources. Imagine that AGI (an artificial intelligence system that can think, learn, and perform a wide variety of tasks like a human) were monopolized by a single company or country. The world of cyberpunk would no longer be a game but a harsh reality facing future generations.

This is not alarmism. On January 21, 2025, the day after taking office, Trump could not wait to announce the U.S. “Stargate” program. Stargate, in short, aims to build the world's leading AI infrastructure with a capital investment of more than $500 billion. With U.S. debt close to the breaking point, why would Trump still go all in on AI with so much money? His real purpose is to reach AGI first by brute-force accumulation of computing power, giving AI the ability to learn by itself and solve problems it was never trained on. Nearly unlimited AI computing power would then replace the limited computing power of the human brain and, by exhaustive search, find the right answers as quickly as possible to all the research problems that humans have spent decades working through by painstaking trial and error, thereby preserving America's lead.

Do you really think they would be so kind as to invest that much and then hand the results to all of mankind for free? Do not test human nature with profit, because human nature cannot stand the test. When one enterprise is first to achieve AGI, it will without doubt monopolize the whole industry and amass astonishing wealth. When one country is first to achieve AGI, it will without doubt use AI as a tool to dominate the world. Think of how the U.S. blackmailed the world when it was the only country with nuclear weapons. In a sense, AGI is scarier than nuclear weapons were then. The principles behind nuclear weapons are public; a country that tightens its belt and withstands international pressure with enough determination can build them. AGI is different. The United States holds two trump cards, its first-mover advantage in AI and its monopoly on GPUs, and both are huge moats. If you try to stack up graphics cards, data, and resources the way the Americans do, you will never out-stack them and never catch up. As long as the U.S. restricts your algorithms and your access to high-performance GPUs, you will stay locked at the bottom, unable to cross the threshold of AGI.

So what is to be done? Open source. Open source is an open stratagem that both “strengthens oneself” and “strikes the enemy”. It “strengthens oneself” because open source removes the barrier to AI entirely, so that within a short time AI reaches everyone.
Before, however impressive GPT was, it not only charged fees but also specifically restricted users in mainland China, Hong Kong, and Macao, all but writing the word “discrimination” on its face, so it never became widely used there. And after DeepSeek was open-sourced? Even if you know nothing about the Five Elements or the Eight Trigrams, you only need to download an app on your phone to tell fortunes for your relatives and elders when you go home for the New Year. Through round after round of questions and feedback, DeepSeek itself learns to understand people better and give better answers, which for it is a form of iteration and evolution.

At an international software development conference, a foreign developer once asked a Chinese programmer: how does your software iterate so fast? Is it because you employ tens of thousands of programmers? The Chinese programmer gave a wry smile: “Not really. It's because we have hundreds of millions of users testing new features (hunting bugs) for us, and they will scold us to death if they find anything unsatisfactory.” That is the route DeepSeek is taking: using hundreds of millions of pieces of feedback to accelerate technical progress. Once users and small and medium-sized enterprises around the world start using DeepSeek, the ecosystem will take shape, the market will follow, and won't the money follow too?

More crucially, open source brings not only hundreds of millions of users to help test but also millions of programmers to take part in research and development. How many people does traditional AI R&D involve? OpenAI, for example, has 390 core R&D positions, a relatively strong force, yet it takes one or two years to iterate a model. And with open source? Every programmer in the world who is interested in large models can join in, and with such “human-wave tactics” bugs get fixed and performance gets optimized quickly. With open source, geniuses no longer need a “license”: whether or not you hold a position at a big company, you are qualified to work on large models. According to JetBrains statistics, China has more than 9.4 million software developers, about one-third of the world total. On GitHub, the world's largest code community, Chinese-language AI projects are growing three times as fast as English-language ones, and Chinese tech companies contribute 35% of top AI papers. China's 20 years of science and technology education can be said to have built up the world's largest reserve of computing talent, all of it capital for China to achieve AGI in the future.
How many of China's hundreds of millions of science and engineering graduates would like to deploy DeepSeek locally, and how many of its 9.4 million software developers will help debug and improve it? And do not forget the more than 10 million developers abroad, who are just as interested in DeepSeek, whose performance is no worse than GPT's. There is a saying in the GitHub community: when you generously open the door to technology, the world's brightest minds will come knocking with gifts. In other words, open-sourcing DeepSeek is like inviting developers from all over the world to join its research and development, and if just 100 of them have a “flash of brilliance”, it could be a turning point for human civilization, a moment when the stars of mankind shine. For a China that is short of computing power and started late, this is the only chance to win the technological revolution now at hand.

The “strike against the enemy” is aimed at the U.S. “Stargate”. When Stargate was announced at the beginning of the year, American newspapers boasted openly that it was the Manhattan Project of a new era: the Manhattan Project let the United States dominate the past, and Stargate would ensure that it dominates the future. It is worth noting that a large share of Stargate's $500 billion comes from Europe, Japan, South Korea, and the Middle East. Besides solving the funding shortfall, the aim is to bind these countries' interests to those of the United States, on the one hand providing a market for American AGI applications and on the other isolating China completely and turning it into an AI island.

But now that DeepSeek is open source, the Stargate plan has been thoroughly upended. DeepSeek's total cost, counted in full, was only about 5 million U.S. dollars, for a large model at roughly the level of o1. If you were an investor in Europe, Japan, South Korea, or the Middle East, you would start to doubt Stargate's high price tag: are the Yankees just finding a new way to swindle me out of my money? With that kind of money, why not do it myself? Or give 10% of it to China and let China do it? Wouldn't saving 90% be nice? What is more, DeepSeek has also cut the ground from under OpenAI's API profit model. With a free DeepSeek available, who would still pay for GPT? Altman was forced to announce that GPT-5 would be fully opened, free and unlimited for all registered ChatGPT users.

With no need for massive capital investment, open source once again pulls the United States and China back to the same starting line. The United States can only watch its runaway dominance in AI turn into an open contest among many contenders, in which every country can share the development opportunities AGI brings and every ordinary person can enjoy the convenience it brings to daily life. Ensuring that AGI is not born in a monopolistic country leaves the development of human civilization its last bit of hope for freedom.

On New Year's Eve 2025, a letter circulated on the Internet, reportedly written by DeepSeek founder Liang Wenfeng in reply to Feng Ji, founder of Game Science and producer of Black Myth: Wukong, who had said that “DeepSeek may be a scientific and technological achievement at the level of national destiny”. The letter turned out to be a fake, but it is very well written, and we excerpt it below:

I have to be honest: everyone on the team felt their scalp go numb on reading the phrase “national destiny”. We are only standing on the shoulders of the giants of the open source community, tightening a few more screws in the edifice of domestic large models. Behind each of the six breakthroughs you mentioned there is an even more moving story. The mini model that can run on cell phones was inspired by an issue raised on GitHub by a middle school teacher in Gansu. The networked search feature was “fed” into existence by a beta user who submitted error logs at 3:00 a.m. for 30 days in a row. Last week, a visually impaired developer used our API to build a “scent navigation” application, and when he demonstrated how it identifies street-side stores through vibrations of different frequencies, the whole conference room was so quiet that you could hear the hum of the graphics card fans.
At that moment my eyes grew warm, and I finally understood what you meant by “water and electricity”: what is truly great is never a model but the ripples of goodwill that millions of ordinary people create with it. Mr. Feng spoke of “equal rights to knowledge and information”, and that is exactly what drives us to study papers night after night. Three years ago, in a small warehouse on Yuhangtang Road, we wrote on a glass wall with a marker: “Let children in the most remote mountain villages use the same smart AI teaching assistants as Silicon Valley engineers”. We are still far from that dream, but every time we see the conversation screenshots that users share online, we feel that all the hair lost to late nights was worth it. Finally, I want to say to everyone: please save the applause for every Chinese developer who is rewriting the rules. When you debug a model on the bus, sketch an architecture diagram at a breakfast stand, or are suddenly struck by an optimization idea in a maternity ward, that is the moment of “national destiny”. DeepSeek would like to be a match in the wilderness of code for all of you, but what really lights the fire of AI is always the unquenchable curiosity and persistence in your eyes.

Judging from its wording and style, the letter was very likely written by DeepSeek itself, a large model trained by the Chinese themselves and therefore able to find exactly what moves Chinese readers to tears. Perhaps DeepSeek itself recognizes its mission: to carry the light of AI to every corner of the world through open source, as paper and printing once did, and to break the ancient curse that “knowledge must depend on power”.
More than 1,300 years after the Battle of Talas, history has again reached a crossroads, and mankind again faces an age-old choice: should knowledge be locked up in parchment books or laid out on paper? On February 11th, the Artificial Intelligence Action Summit came to a close in Paris, France. France, China, India, and 61 other signatories signed the Paris Declaration on Artificial Intelligence and jointly released the “Statement on Inclusive and Sustainable Artificial Intelligence for People and the Planet”, committing to an “open”, “inclusive”, and “ethical” approach to AI. The United States and its ally, the United Kingdom, however, refused to sign the joint statement. Does this not resemble the medieval monopoly of knowledge by the church and the aristocracy? China's advocacy of open source, by contrast, is the spiritual continuation of papermaking and movable-type printing. China's open-source AI may not make money for now, but giving up short-term individual gain can buy a long-term explosion of civilization. Papermaking did not make Cai Lun rich, and the metal movable-type press did not save Gutenberg from bankruptcy, yet literacy across Europe soared as a result, giving rise to such stars of humanity as Descartes, Newton, and Watt.

This great trend will most likely repeat itself in artificial intelligence, and DeepSeek, together with the Chinese-style high-performance open-source route it represents, is the future of mankind. Perhaps in the near future, fishermen in Bangladesh will use a fine-tuned DeepSeek model to predict the movements of fish stocks; newly graduated college students in Kenya will use one to issue agricultural disaster warnings; and a college student in India will use one to carry out accurate diabetic fundus screening, so that more people need not bid farewell to the light. The Chinese people, taught from childhood that “when you succeed, you should benefit the whole world”, have always believed, however far development has gone, that the ultimate meaning of technology lies not in enjoying it exclusively but in illuminating more people. Compared with the United States, which shouts “Make America Great Again”, China, which holds up the banner of a “community with a shared future for mankind”, is better qualified to steer the future of mankind and AI. Two thousand years ago, China exported the Four Great Inventions along the Silk Road; two thousand years later, China is sending the fire of intelligence, imbued with the spirit of “technological equality”, to the world through fiber-optic cables. History has already shown that paper crushes parchment and the printing press crushes the hand-copied book. Now China's open-source AI will smash the chains of closed-source hegemony. Perhaps in all the world only China, with its 5,000 years of unbroken civilization, truly understands that the true meaning of civilization is never to lock knowledge in a safe but to let it burn across the wilderness like wildfire!

#sourceaideepseekchatgpt
