Zawgyi to Unicode converter. Integrating autoconversion: Facebook’s path from Zawgyi to Unicode

Zawgyi To Unicode Conversions Tips, Tricks and Risks

zawgyi to unicode converter

Unicode was designed as a global system to allow everyone in the world to use their own language on their devices. But most devices in Myanmar still use Zawgyi, which is incompatible with Unicode. With a model trained on text created in Zawgyi and Unicode, we can assess the probability that a given string was created with a Zawgyi or a Unicode keyboard. Running our detection and conversion every time someone fetches any type of content would be prohibitive in terms of the time and resources required. Because we know this transition will take time, our Zawgyi-to-Unicode converter will continue to allow people transitioning to Unicode to read posts, messages, and comments even if their friends and family have not yet transitioned their devices.

Burmese Font Converter

Also, not every string of code points makes sense in both encodings. Best Practice: In summary, this method is faster than the first one. Contributors: Ko Thant Thet Khin Zaw, Ko Aung Myo Kyaw, Ko Thixpin, San Lin Naing. Zawgyi supports entering only Burmese text, while Unicode enables entering minority languages spoken in Myanmar, like Shan and Mon. Database Conversion Method: In this method, we convert at the database level. In order to better reach their audiences, content producers in Myanmar often post in both Zawgyi and Unicode in a single post, not to mention English or other languages. But this method has some risks, problems and limitations.

GitHub

The other one is for expert users who have database and server management experience. Implementing autoconversion across our products was not a simple task. To do this, we can take advantage of the fact that in one encoding a sequence of code points combines text fragments into a single character, while in the other encoding those same code points might represent separate characters. Unfortunately, Zawgyi and Unicode use the same range of code points to represent characters in Burmese and other languages. These tools will make a big difference for the millions of people in Myanmar who are using our apps to communicate with friends and family.
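To make the shared code-point range concrete, here is a minimal Python sketch. It relies on one property that is not stated in the text above but is widely documented: Zawgyi stores the vowel sign E (U+1031) before its consonant, in visual order, while Unicode stores it after. The function names are hypothetical, and the second check is only a toy heuristic, not a real detector.

```python
MYANMAR_BLOCK = range(0x1000, 0x10A0)  # U+1000..U+109F, used by both encodings
VOWEL_SIGN_E = "\u1031"


def uses_myanmar_block(text: str) -> bool:
    """True if any character falls in the Myanmar block."""
    return any(ord(ch) in MYANMAR_BLOCK for ch in text)


def e_vowel_starts_word(text: str) -> bool:
    """True if U+1031 opens the text or directly follows whitespace.

    Assumption: that ordering is typical of Zawgyi and not of Unicode.
    """
    previous = " "
    for ch in text:
        if ch == VOWEL_SIGN_E and previous.isspace():
            return True
        previous = ch
    return False


unicode_order = "\u1014\u1031"  # consonant NA, then vowel sign E
zawgyi_order = "\u1031\u1014"   # vowel sign E stored before the consonant

# Block membership alone cannot separate the two encodings...
print(uses_myanmar_block(unicode_order), uses_myanmar_block(zawgyi_order))
# ...but the order of the code points gives a hint.
print(e_vowel_starts_word(unicode_order), e_vowel_starts_word(zawgyi_order))
```

Both strings use exactly the same two code points, which is why a range check is not enough and sequence-level evidence is needed.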

Rabbit Zawgyi Unicode Converter for iOS

The font converter is now fully implemented on Facebook and Messenger. Once we have this information, we can tell the server in future web requests whether the device is using Zawgyi or Unicode and make sure any content that is fetched matches. In summary, this method is very easy. Conversion: Next, the server checks whether it is loading Burmese content. This post will detail the technical challenges involved in integrating these converters, including how we differentiate Zawgyi text from Unicode, how we can tell whether a device uses Zawgyi or Unicode, and how to convert between the two, as well as some lessons we learned along the way. About a year ago, we integrated font detection and conversion to convert all content into Unicode before it goes through our classifiers.
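The server-side flow described above (check whether the content is Burmese, detect its encoding, convert only on a mismatch with the device) could be sketched roughly as follows. This is an illustration under stated assumptions, not Facebook's implementation: `content_for_device`, `detect`, `zg2uni` and `uni2zg` are hypothetical names, and the detector and converters are passed in as plain callables so the sketch stays self-contained.

```python
from typing import Callable


def contains_burmese(text: str) -> bool:
    """True if any character is in the Myanmar block (U+1000..U+109F)."""
    return any(0x1000 <= ord(ch) <= 0x109F for ch in text)


def content_for_device(
    text: str,
    device_encoding: str,            # "zawgyi" or "unicode", reported by the client
    detect: Callable[[str], str],    # hypothetical detector returning "zawgyi" or "unicode"
    zg2uni: Callable[[str], str],    # hypothetical Zawgyi-to-Unicode converter
    uni2zg: Callable[[str], str],    # hypothetical Unicode-to-Zawgyi converter
) -> str:
    """Return text in the encoding the requesting device can render."""
    if not contains_burmese(text):
        return text                  # nothing to convert
    content_encoding = detect(text)
    if content_encoding == device_encoding:
        return text                  # already matches the device
    if content_encoding == "zawgyi":
        return zg2uni(text)
    return uni2zg(text)
```

Skipping non-Burmese content first keeps the more expensive detection step off the common path.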

Integrating autoconversion at Facebook scale: The next challenge was to integrate this conversion across the different types of content that people can create on our apps. We also intend to continue contributing to the open source myanmar-tools library to help others build tools to support this transition. Each of the requirements for the autoconversion (content encoding detection, device encoding detection, and conversion) had its own challenges. Zawgyi text has been entered for status updates as well as for user names, comments, video subtitles, private messages, and more. This extension is freeware, and its source code can be found here. The lack of standardization around Unicode makes automation and proactive detection of violating content harder, can weaken account security, makes reporting potentially harmful content on Facebook less efficient, and means less support for languages in Myanmar beyond Burmese. Facebook supports Unicode because it offers support and a consistent standard for every language.

Integrating autoconversion: Facebook’s path from Zawgyi to Unicode

This model keeps track of how likely a series of code points is to occur in Unicode versus in Zawgyi for each sample. Simple Conversion Method: In this method, we convert on a post-by-post basis. In this article, I will share my experience and conversion process. To continue supporting the people of Myanmar through this transition to Unicode, we are exploring expanding our autoconversion tools to more of the Facebook family of products, as well as improving the quality of our automatic detection and conversion. This makes any kind of communication between systems a huge challenge. This article is divided into two parts.
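One way to picture a model that tracks how likely a series of code points is in each encoding is a pair of character-bigram models, one trained per encoding, whose likelihoods are compared. The sketch below is a toy under assumptions: the class and function names are hypothetical, the training samples are two-character stand-ins rather than real data, and the production model is not described in enough detail here to claim this matches it.

```python
import math
from collections import defaultdict


class BigramModel:
    """Counts how often one code point follows another."""

    def __init__(self):
        self.counts = defaultdict(lambda: defaultdict(int))
        self.vocab = set()

    def train(self, samples):
        for text in samples:
            padded = "^" + text  # "^" marks the start of a string
            for a, b in zip(padded, padded[1:]):
                self.counts[a][b] += 1
                self.vocab.update((a, b))

    def log_prob(self, text, vocab_size):
        """Log-likelihood of text with add-one smoothing."""
        padded = "^" + text
        total = 0.0
        for a, b in zip(padded, padded[1:]):
            row = self.counts.get(a, {})
            total += math.log((row.get(b, 0) + 1) / (sum(row.values()) + vocab_size))
        return total


def zawgyi_probability(text, zawgyi_model, unicode_model):
    """Probability that text is Zawgyi, assuming equal priors."""
    vocab_size = len(zawgyi_model.vocab | unicode_model.vocab)
    diff = unicode_model.log_prob(text, vocab_size) - zawgyi_model.log_prob(text, vocab_size)
    if diff > 700:  # avoid overflow in exp for very long, clearly Unicode text
        return 0.0
    return 1.0 / (1.0 + math.exp(diff))


# Toy training data (assumption): vowel sign E before vs. after the consonant.
zawgyi_model, unicode_model = BigramModel(), BigramModel()
zawgyi_model.train(["\u1031\u1014", "\u1031\u1000"])
unicode_model.train(["\u1014\u1031", "\u1000\u1031"])

print(zawgyi_probability("\u1031\u1014", zawgyi_model, unicode_model))
print(zawgyi_probability("\u1014\u1031", zawgyi_model, unicode_model))
```

Because every transition contributes evidence, shorter strings give the model less to work with, so scores for very short inputs sit closer to 0.5.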

Thank you for your support of the community. This is a cross post from mmshare. This extension will check web content and convert it to Unicode-encoded text if it is Zawgyi. It is also a high-risk process. If we create a string on-device and check the width of that string, we can tell which font encoding the device is using to render the string. The incompatibility makes communication on digital platforms difficult, as content written in Unicode appears garbled to Zawgyi users and vice versa. Also, messages and comments are often very short, lowering detection accuracy.
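The width check mentioned above can be sketched as follows. The probe string is an assumption, not taken from the text: under Unicode rules U+1039 stacks the second consonant beneath the first, so the sequence should render about as wide as one consonant, whereas a Zawgyi font is expected to draw the pieces side by side. `measure_width` is a hypothetical callback standing in for whatever text-measurement API the platform provides.

```python
from typing import Callable

STACKED = "\u1000\u1039\u1000"  # KA + U+1039 + KA (assumed probe string)
SINGLE = "\u1000"               # a lone KA for comparison


def device_uses_unicode(
    measure_width: Callable[[str], float],
    tolerance: float = 1.5,
) -> bool:
    """Guess the device font encoding from rendered widths.

    Returns True when the probe renders roughly as wide as one
    consonant, which is what a Unicode font is assumed to do.
    """
    single_width = measure_width(SINGLE)
    if single_width <= 0:
        raise ValueError("font does not render Myanmar text")
    return measure_width(STACKED) < tolerance * single_width


# Fake renderers for illustration only.
fake_unicode_font = lambda s: 10.0           # stack is as wide as one glyph
fake_zawgyi_font = lambda s: 10.0 * len(s)   # every code point takes its own space

print(device_uses_unicode(fake_unicode_font))
print(device_uses_unicode(fake_zawgyi_font))
```

The result can then be reported to the server so later requests fetch content in the matching encoding.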

Get MUA Web Unicode Converter

This method is suitable for most users. If our database files are very large and the internet connection is too slow, the import process is unlikely to succeed. We have been familiar with Zawgyi for a very long time. One is for normal users who have no knowledge of database and server management. Instead, Zawgyi is the dominant typeface used to encode Burmese language characters. It is not easy for everyone to convert all our documents, files, web pages, blog posts and databases written in Zawgyi to 100% Unicode fonts (Myanmar3, Padauk, Parabaik, etc.).
