Friday, June 19, 2015

Code Project: AnkiTranslate

CLICK HERE to see the full app on GitHub.

Coming out of a medical background, I can't say I'm one for memorizing much of anything (memorizing hundreds of pages/slides at a time taught me 2 things: #1 amazing time management skills, #2 you forget everything you don't directly apply and use often). BUT I will say that for learning a language you need to memorize vocabulary, and Anki is the tool out there to help you. Anki is simple and free: it's flash cards that you make and that do spaced repetition (it hits you with the cards you missed at a time interval you can set that is good for you). However, I have written hours and hours of notes and cards by hand and, well, that's no fun.

In a language, I've seen most people say you need about 2000-5000 words to be conversational. A lot of English speakers use around 15,000-20,000 words, and crazy specialized folks who go deep into an area (could be medicine, science, whatever) may know 30,000 or so words. So let's stick with 2000-5000 words. If I am FAST and can make 1 Anki card a minute, I will spend 5000 minutes making flash cards, which is about 83 hours. Yes, yes, making the cards *might* help me learn them a little more. But I do think that in 83 hours of flipping through the cards I could be done with at least a good portion of them and be speaking / interacting with people instead of still making more cards.
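For anyone who wants to plug in their own numbers, here is a minimal sketch of that estimate. The class and method names are made up for illustration and are not part of AnkiTranslate.

```csharp
using System;

public static class CardTimeEstimate
{
    // Hours needed to hand-write `cards` flash cards at `minutesPerCard` minutes each.
    public static double HoursToMakeCards(int cards, double minutesPerCard)
    {
        return cards * minutesPerCard / 60.0;
    }
}
```

Calling CardTimeEstimate.HoursToMakeCards(5000, 1.0) gives roughly 83.3, and the low end of 2000 words still comes out to about 33 hours.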

I could spend said 83 hours making a TON of flashcards for random vocabulary. Or I could spend 2 hours making a little app, drop it on GitHub for anyone to use / see, and get to program a little :D (and drop my time making flash cards down to about 30 seconds). I can find plenty of lists of the top 100 or 1000 most common English words, drop one into the program and get flashcards off of a simple copy and paste.

Using the app: input a text file of the words you want to translate, one word per line (or edit the app to deal with CSV or whatever format you like). Let AnkiTranslate know which language you are starting with and which one you want outputted, and then..
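As a rough sketch of what "one word per line" means in code, something like the following would turn the file contents into a clean word list. WordListParser and ParseWords are hypothetical names for illustration. The app itself splits the text inline, as shown further down.

```csharp
using System;
using System.Collections.Generic;

public static class WordListParser
{
    // Splits raw file contents into words, one per line.
    // Accepts both Windows (\r\n) and Unix (\n) line endings and drops blank lines.
    public static List<string> ParseWords(string fileContents)
    {
        var words = new List<string>();
        if (string.IsNullOrEmpty(fileContents)) return words;

        string[] lines = fileContents.Split(new[] { "\r\n", "\n" }, StringSplitOptions.RemoveEmptyEntries);
        foreach (string line in lines)
        {
            string word = line.Trim();
            if (word.Length > 0) words.Add(word); // skip whitespace-only lines
        }
        return words;
    }
}
```

Dropping blank lines up front means a trailing newline at the end of the file does not turn into an empty card later.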

Output: you will get a text file you can import into Anki that will make your flash cards. Anki's import format is the front of the card, a tab, the back of the card, then a newline to start the next card. So each line of the output will be your initial word, a tab, and the translated word.
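To make the format concrete, here is a small sketch that builds the import text from two parallel lists. AnkiFormatter and ToAnkiImport are illustrative names I made up for this example, not the code the app ships with.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class AnkiFormatter
{
    // Builds Anki import text: one card per line, front and back separated by a tab.
    // Stops at the shorter list so mismatched inputs cannot cause an index error.
    public static string ToAnkiImport(IList<string> fronts, IList<string> backs)
    {
        int count = Math.Min(fronts.Count, backs.Count);
        var sb = new StringBuilder();
        for (int i = 0; i < count; i++)
        {
            sb.Append(fronts[i]).Append('\t').Append(backs[i]).Append('\n');
        }
        return sb.ToString();
    }
}
```

Passing in the lists { "dog", "cat" } and { "perro", "gato" } produces two lines, "dog<TAB>perro" and "cat<TAB>gato", which is the shape Anki expects from a tab-separated text import.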

Caveat: This is good for vocabulary and short phrases. Expect that a translator will, well, mess up grammar and make you sound funny if you try to translate a ton of phrases. My purpose is to use this to get a mass of pure one-word vocabulary down, then I will study grammar and sentence structure on their own and put together the words I learn from this. Sometimes the meaning of the word won't be the one you think of, it happens.. but it'd happen anyway using a translator. If I get 95% of the words right off a translator, that's still pretty good and I'll fix the rest when I use them wrong and am corrected :P.

Here's the fun: planning. I decided on WPF because it's easy for me to navigate as I've worked on projects with it before, and the UI is simple and looks nice (I say nice relatively: it works, and I don't care about design as much as function and logistical use). So in the C# code-behind of the main window I made a plan. I usually *try* my best to bullet out a good order of what I am working on, and it almost always partially turns out that way:
using System.Windows;

namespace AnkiTranslate
{
    /// <summary>
    /// Interaction logic for MainWindow.xaml
    /// </summary>
    public partial class MainWindow : Window
    {
        public MainWindow()
        {
            InitializeComponent();
        }
        private void Translate_Click(object sender, RoutedEventArgs e)
        {
            // use native windows importer for file

            // display file path on textbox

            // parse file into string

            // google translate API work

            // parse translator API work + initial values into anki format

            // native windows save file location option

            // export .txt file

            // display success message on textbox
            
        }
    }
}
And when I dropped some text boxes onto a window it looks like this:



I ran into a snag and realized that Google does not offer free translating at all anymore, but Microsoft does. I went to the Microsoft Translator API and followed the directions there, and it was easy to plug right into my app. This is the expanded version of my initial plan.
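using System;
using System.IO;
using System.Windows;
using System.Windows.Controls;
using Microsoft.Win32;

namespace AnkiTranslate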
{
    /// <summary>
    /// Interaction logic for MainWindow.xaml
    /// </summary>
    public partial class MainWindow
    {
        public MainWindow()
        {
            InitializeComponent();

            Languages languageChoices = new Languages();
            languageChoices.Populate();
        }

        private void ComboBoxFrom_Loaded(object sender, RoutedEventArgs e) { ComboboxDoWork(sender); }

        private void ComboBoxTo_Loaded(object sender, RoutedEventArgs e) { ComboboxDoWork(sender); }

        private void ComboboxDoWork(object sender)
        {
            var comboBox = sender as ComboBox;

            if (comboBox == null) return;
            comboBox.ItemsSource = ConfigClass.Languages;
            comboBox.SelectedIndex = 0;
        }

        private void Translate_Click(object sender, RoutedEventArgs e)
        {
            ConfigClass.LanguageTranslatedFrom = ((ConfigClass.ComboboxItem) lanFromComboBox.SelectedValue).Value;
            ConfigClass.LanguageToTranslateTo = ((ConfigClass.ComboboxItem) lanToComboBox.SelectedValue).Value;

            // Only one file is read below (FileName), so multi-select is turned off
            var openFileDialog = new OpenFileDialog {Filter = "Text Files (.txt)|*.txt", FilterIndex = 1, Multiselect = false};
            bool? userConfirmation = openFileDialog.ShowDialog();

            if (userConfirmation != true) return;
            string file = openFileDialog.FileName;

            try
            {
                ConfigClass.TextToTranslate = File.ReadAllText(file);
                MsgBoxLabel.Content = file;
            }
            // Keep the original IOException as the inner exception so the real cause is not lost
            catch (IOException ex) { throw new Exception("Something went wrong, eh?", ex); }

            ConfigClass.TranslatedText = new MicrosoftTranslator().Translate();

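            // Pair each source line with its translation in Anki's "front<TAB>back" format
            string[] textToTranslateArray = ConfigClass.TextToTranslate.Split(new string[] { "\r\n", "\n" }, StringSplitOptions.None);
            string[] translatedTextArray = ConfigClass.TranslatedText.Split(new string[] { "\r\n", "\n" }, StringSplitOptions.None);
            string finalText = "";

            // Stop at the shorter array so a mismatched translation cannot index out of range
            int cardCount = System.Math.Min(textToTranslateArray.Length, translatedTextArray.Length);
            for (int i = 0; i < cardCount; i++)
            {
                // Skip blank lines (e.g. a trailing newline) instead of always dropping the last entry
                if (textToTranslateArray[i].Length == 0) continue;
                finalText += textToTranslateArray[i] + "\t" + translatedTextArray[i] + System.Environment.NewLine;
            }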

            var saveDialog = new SaveFileDialog { Filter = "Text Files (.txt)|*.txt", FilterIndex = 1};
            bool? userSaveConfirmation = saveDialog.ShowDialog();

            if (userSaveConfirmation != true) return;
            string savePath = saveDialog.FileName;

            File.WriteAllText(savePath, finalText);
            MsgBoxLabel.Content = "Successfully saved to: " + saveDialog.FileName;
        }
    }
}



Since it's a small app, I just hardcoded in some languages (feel free to adjust as needed to whichever you are learning). The website that has a list of all possible languages is here: msdn.microsoft.com/en-us/library/hh456380.aspx
using System.Collections.Generic;

namespace AnkiTranslate
{
    public class Languages
    {
        // full list of supported languages here: msdn.microsoft.com/en-us/library/hh456380.aspx
        public void Populate()
        {
            var en = new ConfigClass.ComboboxItem { Text = "English", Value = "en" };
            var es = new ConfigClass.ComboboxItem { Text = "Spanish", Value = "es" };
            var bg = new ConfigClass.ComboboxItem { Text = "Bulgarian", Value = "bg" };
            var cn = new ConfigClass.ComboboxItem { Text = "Chinese", Value = "zh-CHS" };
            var tlh = new ConfigClass.ComboboxItem { Text = "Klingon", Value = "tlh" };

            ConfigClass.Languages = new List<ConfigClass.ComboboxItem> {en, es, bg, cn, tlh};
        }
    }
}


I also made a ConfigClass to hold some global variables instead of passing around local ones to my different functions.

using System.Collections.Generic;

namespace AnkiTranslate
{
    public class ConfigClass
    {
        public static string TextToTranslate { get; set; }
        public static string LanguageTranslatedFrom { get; set; }
        public static string TranslatedText { get; set; }
        public static string LanguageToTranslateTo { get; set; }

        public static List<ComboboxItem> Languages { get; set; }

        public class ComboboxItem
        {
            public string Text { get; set; }
            public string Value { get; set; }
            public override string ToString()
            {
                return Text;
            }
        }
    }
}


You can see below that the variables clientId and strTranslatorAccessUri reference ConfigurationManager.AppSettings, which reads the keys I added to my App.config file. This is the actual work of going to the Microsoft API and returning the translated text.
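For reference, the App.config entries would look roughly like the fragment below. The key names are the ones the code reads (clientID, clientSecret, strTranslatorAccessURI); the values are placeholders I made up, so swap in the client ID, secret and token endpoint you get when you register with Microsoft.

```xml
<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <appSettings>
    <!-- Placeholder values: replace with your own registration details -->
    <add key="clientID" value="YOUR-CLIENT-ID" />
    <add key="clientSecret" value="YOUR-CLIENT-SECRET" />
    <add key="strTranslatorAccessURI" value="YOUR-TOKEN-ENDPOINT-URL" />
  </appSettings>
</configuration>
```

Keeping these in App.config rather than in the source means the secret stays out of the code you push to a public repository, as long as the config file itself is not committed with real values.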

using System;
using System.Configuration;
using System.IO;
using System.Text;
using System.Web;

namespace AnkiTranslate
{
    public class MicrosoftTranslator
    {
        public string Translate()
        {
            string clientId = ConfigurationManager.AppSettings["clientID"];
            string clientSecret = ConfigurationManager.AppSettings["clientSecret"];
            string strTranslatorAccessUri = ConfigurationManager.AppSettings["strTranslatorAccessURI"]; 

            String strRequestDetails = string.Format("grant_type=client_credentials&client_id={0}&client_secret={1}&scope=http://api.microsofttranslator.com", HttpUtility.UrlEncode(clientId), HttpUtility.UrlEncode(clientSecret));

            System.Net.WebRequest webRequest = System.Net.WebRequest.Create(strTranslatorAccessUri);
            webRequest.ContentType = "application/x-www-form-urlencoded";
            webRequest.Method = "POST";

            byte[] bytes = System.Text.Encoding.ASCII.GetBytes(strRequestDetails);
            webRequest.ContentLength = bytes.Length;

            using (var outputStream = webRequest.GetRequestStream()) outputStream.Write(bytes, 0, bytes.Length);

            System.Net.WebResponse webResponse = webRequest.GetResponse();

            var serializer = new System.Runtime.Serialization.Json.DataContractJsonSerializer(typeof(AdmAccessToken));

            //Get deserialized object from JSON stream 
            AdmAccessToken token = (AdmAccessToken)serializer.ReadObject(webResponse.GetResponseStream());
            if (token == null) throw new ArgumentNullException("token");

            string headerValue = "Bearer " + token.access_token;

            // User input text to translate plus chosen TO and FROM languages 
            string uri = "http://api.microsofttranslator.com/v2/Http.svc/Translate?text=" +
                HttpUtility.UrlEncode(ConfigClass.TextToTranslate) + "&from=" + ConfigClass.LanguageTranslatedFrom + "&to=" + ConfigClass.LanguageToTranslateTo;
            System.Net.WebRequest translationWebRequest = System.Net.WebRequest.Create(uri);
            translationWebRequest.Headers.Add("Authorization", headerValue);
            System.Net.WebResponse response = null;
            response = translationWebRequest.GetResponse();
            Stream stream = response.GetResponseStream();

            Encoding encode = Encoding.GetEncoding("utf-8");
            var translatedStream = new StreamReader(stream, encode);
            System.Xml.XmlDocument xTranslation = new System.Xml.XmlDocument();
            xTranslation.LoadXml(translatedStream.ReadToEnd());

            return xTranslation.InnerText;
        }
    }
}
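
The AdmAccessToken class used above is not shown in this post. Here is a minimal sketch of what it could look like, plus the two parsing steps pulled out so they can be tried without hitting the network. Only access_token is actually referenced by the code above; the other three fields, the TranslatorResponses class and its method names are my assumptions for illustration.

```csharp
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Json;
using System.Text;
using System.Xml;

// Assumed shape of the token response; only access_token is used by the app.
[DataContract]
public class AdmAccessToken
{
    [DataMember] public string access_token { get; set; }
    [DataMember] public string token_type { get; set; }
    [DataMember] public string expires_in { get; set; }
    [DataMember] public string scope { get; set; }
}

public static class TranslatorResponses
{
    // Deserializes the JSON token response into an AdmAccessToken.
    public static AdmAccessToken ParseToken(string json)
    {
        var serializer = new DataContractJsonSerializer(typeof(AdmAccessToken));
        using (var stream = new MemoryStream(Encoding.UTF8.GetBytes(json)))
        {
            return (AdmAccessToken)serializer.ReadObject(stream);
        }
    }

    // Pulls the translated text out of the XML body, the same way Translate() does.
    public static string ParseTranslation(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        return doc.InnerText;
    }
}
```

Splitting the parsing out like this makes it possible to feed in a saved response and check the output without burning through the API quota.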


The finished product looks like the following:


If you do get to use this app, here's a good sample of the top 1000 words and how useful they are by example: http://splasho.com/upgoer5/

Also, this is *perfect* for the app to translate to another language.. 1000 words separated by a new line. http://splasho.com/upgoer5/phpspellcheck/dictionaries/1000.dicin

Here's my use of the app and the output file (plugs right into Anki and I get my deck! Whoo hoo! 80 hours to play more games..): 1000 most common vocabulary words in English / Bulgarian

If you want to try it out yourself, have a go! (I just uploaded the release version to Google Drive so you can download from there. If anyone actually wants to use this and doesn't program / wants me to add more languages just ping me on the contact form and I can add the rest..) Download