Privacy Advisor

How Baidu Wraps Privacy Into New Products

January 28, 2014

By Sam Pfeifle
Publications Director

Richard Lee, global marketing director at Chinese search giant Baidu, knows that Chinese companies have a certain perception to overcome in terms of privacy.

“A lot of Western people, in Europe or North America, they don’t understand China,” he said. “They think it’s like the old films. But China is actually doing a great deal to keep in line with modern times. … I agree that maybe we at Baidu need to do more to prove that we respect privacy than some Western companies, but we don’t lack those kinds of concepts here in China. We want to keep in line with international standards.”

This is why the company is eager to talk about the privacy controls Baidu has put in place as part of its product development efforts. Take, for example, the company’s introduction of its IME and Simeji products into the Japanese marketplace. IME, for desktop, and Simeji, for mobile, are software products with two million and eight million users, respectively. They let the computer recognize Japanese characters as they are entered, making typing faster. Both are jointly developed by a team of Japanese and Chinese developers.

To make the software perform better and make user input highly accurate as it predicts what is being typed in real time, the program connects back to Baidu servers, where the nearly infinite character stroke combinations can be processed and better suggestions fed back.

“Practically,” Baidu product managers Jiang Feng and Su Tianhuang told The Privacy Advisor through a translator, “each single user will have their own daily used vocabulary dictionary.”

Right from the outset, Baidu’s product team realized there were a number of privacy issues to be considered. If every keystroke were sent back to central Baidu servers, users’ intimate writings could be accessible both to Baidu and to potential hackers.

Taking direction from Robin Li, Baidu’s founder, who studied in the U.S. and embraced Silicon Valley culture, Lee said, “At the very beginning of product development, privacy is playing a primary role. We bring in different departments to contribute. Our legal team at the beginning will give specific information about what kinds of rules we need to obey and what kind of potential risks we might encounter in each market. Then, as the product develops, we have a group of privacy experts who will focus particularly on privacy protections.”

In the case of IME, the team recognized from the start that it would need to ensure complete sentences or messages were never stored on a single server.

“The data storage in the server is all anonymized,” said the product managers through the translator. “We cannot restore which users were actually entering which data … When a user is inputting a complete sentence, the sentence is cut into different parts and transmitted to different servers. That’s how we ensure that our company cannot understand what is being written.”
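The splitting the product managers describe can be pictured with a minimal sketch. The article does not say how Baidu actually cuts or routes input, so everything here is an assumption made for illustration: the server names, the fixed fragment size and the round-robin assignment are all hypothetical.

```python
# Hypothetical server names; the article does not identify Baidu's hosts.
SERVERS = ["server-a", "server-b", "server-c"]


def split_sentence(sentence, size=2):
    """Cut a sentence into consecutive fragments of at most `size` characters."""
    return [sentence[i:i + size] for i in range(0, len(sentence), size)]


def route_fragments(sentence, servers=SERVERS, size=2):
    """Assign fragments to servers in round-robin order (an assumed policy).

    No user identifier is attached to any fragment, and with two or more
    servers and two or more fragments, no single server receives the
    whole sentence.
    """
    batches = {name: [] for name in servers}
    for index, fragment in enumerate(split_sentence(sentence, size)):
        batches[servers[index % len(servers)]].append(fragment)
    return batches


if __name__ == "__main__":
    for server, fragments in route_fragments("今日は天気がいい").items():
        print(server, fragments)
```

In this sketch each server holds only scattered pieces, which is the property the product managers point to when they say the company cannot reconstruct what was written.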

Further, the program is built so that input such as Roman letters or numerals is never transmitted or stored at all. The company has also developed robust firewall technology to protect the servers themselves from intrusion, even as they are located within the borders of security-sensitive countries like Japan.

Also, from the outset, Baidu makes sure to explain exactly what the software does upon installation and why it uses the cloud, and the user must consent to sending information to the cloud in the first place. Nor does the software have to make use of the cloud: without it, users can still input characters, though accuracy isn’t as high.

As an example of a suggestion made by the privacy team during product development, the developers said it was decided midstream to use HTTPS encryption for all transmissions of data to the cloud. They decided to treat every keystroke like an online payment in terms of security.

“We didn’t hit privacy perfection in one stop,” said the developers, “but we achieved what we ended up with gradually.”

Baidu is already the second-largest search engine in the world, processing some 10 billion search queries a day. As it continues to look at expansion beyond China’s borders into emerging markets like Thailand, Egypt, Brazil and elsewhere, Lee said a dedication to privacy is part of the company’s communications whenever it approaches a new market.

“We want our global users to know the effort we put into protecting personal data,” he said.
