Chapter 1381 Thinking System



“That’s how it is.” Zhou Zhi was so pleased that his eyes narrowed into slits. “But the character cards the two senior brothers provided were just too useful.”

“The ‘Sea of Words’ character cards are numbered and large. Each one carries Pinyin, Republic-era phonetic notation, and even the old four-corner code, along with definitions and examples in small print. That meets every requirement our automatic recognition software has for debugging and calibrating its model. Of course we had to use them!”

After a moment’s thought, he added: “With the help of the character cards, we quickly raised the software’s accuracy in recognizing characters from images from 92 percent, meaning eight wrong characters in every hundred, to 99.6 percent, meaning only four misrecognized characters in every thousand. That is an improvement of more than an order of magnitude.”

“This achievement also carried us over the last major hurdle for the digital library. The software is now truly mature.”

“The digitized ‘Sea of Words’ we brought this time is our first digital book built through computer scanning and recognition. If the ‘Sea of Words’, which has the largest number of characters, can be digitized, then of course we are full of confidence about the rest of the classics!”

“Then can the digitization of ancient books be put on the schedule?” Wei Yixin was overjoyed. “Our school is working on the big book ‘Siku Quanshu Congbi’. How about transferring your student status over to our school...”

Beside him, the director of the computer center also nodded repeatedly. “Actually, Tsinghua University has not removed its head, but you are a liberal arts major, aren’t you? Come to us and work on the ancient books digitization project. Think about it: how great an achievement will it be once this big project is completed? Isn’t that exactly what the ancients meant by the three establishments of a gentleman, the establishing of merit among them?”

Zhou Zhi laughed to himself when he heard this. It had to be said that the cultural atmosphere at Peking University really was good. Just look: even the director of the Computer Center could bring up the proposition of “the three establishments of a gentleman”...

However, this was not possible for the time being, so Zhou Zhi could only explain with a smile: “Our Shu University is also preparing to start work on its two big books, the ‘Tao Zang’ and the ‘Confucian Zang’, and there is no way Grandpa Shi would let me go.”

“But now that we have good tools, everyone can speed up their projects even if I don’t come to Peking University. Let’s get back to business. I will use the digital ‘Sea of Words’ to demonstrate our project’s architecture and protocols, and our process standards for organizing digital classics.”

This was Zhou Zhi’s own creation. Even in his previous life, nothing like it had ever been done in a classics digitization project.

Up until Zhou Zhi traveled back through time, the country’s work on digitizing classics was still at the relatively primitive stage of building tree-structured databases.

Just as in a standard library, each book was stored as a text file, and the file was then labeled with the book title, the book number, and at most the author, publisher, and similar information.

The advantage of this approach is that it is simple and clear, easy to upgrade and maintain, and preserves the information completely. In a word: adequate, but not pleasant to use.

In Zhou Zhi’s eyes, such a thing naturally did not pass muster. At best it counted as a foundation, still a long way from a hundred-foot tower.

That is how information engineering works. The ideas behind the algorithms and protocols are often more important than the groundwork. If the guiding ideas are wrong, then by the time the project has outgrown them and you think about changing course, it is like altering the structure of a house that has already been built: carrying on with the project will cost a huge price.

The enormous confusion over Chinese character encodings in later years is the best footnote to this lesson.

These ideas were worked out under the guidance of Clover’s R&D philosophy and the “how to ask three questions” principle that Zhou Zhi had proposed. “How to ask three questions” sounds mysterious, but it is actually very simple: for any requirement, keep asking “how” through at least three levels.

Take the classics digitization project as an example. Done the usual way, is it easy to use?

The answer is definitely: it’s not easy to use.

Then comes the first question: how do we make it easy to use?

The answer is simple: in addition to scanning books into text documents, we must also build a huge tag system and smart search engine on this basis.

Then comes the second question: how do we build a huge tag system and a smart search engine?

The answer is that we need tag collection software that can automatically analyze a document’s content and extract feature tags from it. To complete the smart search engine, we also need a spherical-network data topology association system.

Then comes the third question: how do we develop this tag collection software and the spherical-network data topology association system?

The answer is to advance software and hardware in parallel. On the software side, we bring the latest principles of mathematical statistics into computer models to develop a set of feature extraction and feature linkage algorithms, and then complete the software development with these algorithms as the guiding idea.

On the hardware side, to overcome the insufficient computing power of existing equipment, we must take advantage of the Internet era, make maximum use of all the computing resources on the wide area network, and develop an Internet-based distributed computing application system.

This concept was still fairly preliminary, but the digital “Sea of Words” built on it was very clear. Zhou Zhi had also pioneered a way of presenting things in three-dimensional layers, to explain the complex relationships among the various contents on a single page of character cards.

Through the illustration, Wei Yixin and Leng Yulong could see that the flat character card had turned into something like a woven-straw grasshopper cage, or a folding lantern once it is opened. The layers were connected by algorithms, and the feature labels of each layer linked together to form the three-dimensional structure of the character card.

This system of thinking was actually easier for liberal arts people to grasp than for science people. For Professor Zeng from the Computing Center it had to be explained carefully, but the two senior brothers took to it with ease.

Because in the minds of the two senior brothers, the body of knowledge around the character cards looked exactly like what Zhou Zhi was now showing in the information system!

Leng Yulong clapped his hands happily. “Wonderful, wonderful! What does it mean for the student to surpass the teacher, for indigo to be bluer than the plant it came from? This is what it means! Who could have thought that one day every page of a book would become three-dimensional?! Hahahaha, this is something we spent years building up in our minds, and now it is clear at a glance. There’s really something to this digital book!”

They had not understood many of the technical details Zhou Zhi had just explained, but the applications those details supported were so familiar that they could not help feeling the surprise of “meeting an old friend in a foreign land.”

“Many of the luminaries of the May Fourth era were deeply disappointed with our traditional Chinese learning and believed that Chinese characters need not exist, or should be converted into a phonetic script,” Zhou Zhi said with a smile. “That was a pessimistic estimate with no basis whatsoever. And yet it is truly unbelievable that this kind of poison still has a market even today.”

“You can’t say that without backing it up.” The two senior brothers were both serious scholars. Leng Yulong said: “You have to come up with strong enough reasons. Otherwise, wouldn’t you be making the same mistake as the May Fourth generation, the mistake of simply following what others say?”

(End of this chapter)
