Need help understanding this Python Viterbi algorithm


I am trying to translate a Python implementation of the Viterbi algorithm into Ruby. The full script is at the bottom of this question, along with my comments.

Unfortunately, I know very little about Python, so the translation is proving harder than I would like. I have made some progress, though. The only line that is completely melting my brain is this one:

    prob_k, k = max((probs[j] * word_prob(text[j:i]), j) for j in range(max(0, i - max_word_length), i))

Could someone please explain what it is doing?

Here is the full Python script:

    import re
    from itertools import groupby

    # text will be a compound word such as 'wickedweather'.
    def viterbi_segment(text):
        probs, lasts = [1.0], [0]

        # Iterate over the letters in the compound,
        # e.g. [w, ickedweather], [wi, ckedweather], and so on.
        for i in range(1, len(text) + 1):
            # I do not know what this line is doing.
            prob_k, k = max((probs[j] * word_prob(text[j:i]), j) for j in range(max(0, i - max_word_length), i))
            # Append the values to the arrays.
            probs.append(prob_k)
            lasts.append(k)

        words = []
        i = len(text)
        while 0 < i:
            words.append(text[lasts[i]:i])
            i = lasts[i]
        words.reverse()
        return words, probs[-1]

    # Calculate the probability of a word from its occurrences in the dictionary.
    def word_prob(word):
        # dictionary.get(key) returns the value for the specified key.
        # In this case, that is the number of occurrences of the word in the
        # dictionary. The second argument is a default value to return if
        # the word is not found.
        return dictionary.get(word, 0) / total

    # This ensures we deal with whole words rather than each
    # individual letter. Basically, it normalizes the words.
    def words(text):
        return re.findall('[a-z]+', text.lower())

    # This gives us a hash where the keys are words and the values are the
    # number of occurrences in the dictionary.
    dictionary = dict((w, len(list(ws)))
        # /usr/share/dict/words is a file of newline-delimited words.
        for w, ws in groupby(sorted(words(open('/usr/share/dict/words').read()))))

    # Assign the length of the longest word in the dictionary.
    max_word_length = max(map(len, dictionary))

    # Assign the total number of words in the dictionary. It is a float
    # because we are going to divide by it later.
    total = float(sum(dictionary.values()))

    # Run the algorithm over a file of newline-delimited compound words.
    compounds = words(open('compounds.txt').read())
    for comp in compounds:
        print(comp, ":", viterbi_segment(comp))

That line passes a generator expression to max(). The expanded version looks like this:

    all_probs = []

    for j in range(max(0, i - max_word_length), i):
        all_probs.append((probs[j] * word_prob(text[j:i]), j))

    prob_k, k = max(all_probs)
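A single max() call can return both values because Python compares tuples element by element: the largest probability wins, and ties are broken by the larger j. The following is a minimal sketch of that equivalence. The values in `probs`, the toy `toy_probs` table, and `max_word_length = 6` are made up for illustration and do not come from the script in the question.

```python
# Toy stand-ins (assumptions for illustration, not the real dictionary).
toy_probs = {"wicked": 0.2, "we": 0.1, "wick": 0.05}

def word_prob(word):
    # Unknown words get probability 0.0.
    return toy_probs.get(word, 0.0)

text = "wicked"
max_word_length = 6
# probs[j] is the best probability found for text[:j]; hypothetical values.
probs = [1.0, 0.0, 0.1, 0.0, 0.05, 0.0]
i = len(text)

# Generator-expression form, as in the line from the question.
best_gen = max((probs[j] * word_prob(text[j:i]), j)
               for j in range(max(0, i - max_word_length), i))

# Expanded loop form: build the list of (probability, j) tuples, then take max.
all_probs = []
for j in range(max(0, i - max_word_length), i):
    all_probs.append((probs[j] * word_prob(text[j:i]), j))
best_loop = max(all_probs)

# Tuple unpacking splits the winning pair into its two parts.
prob_k, k = best_loop
```

Both forms produce the same tuple; the generator form simply avoids building the intermediate list. In Ruby, the closest equivalent is probably mapping the range to two-element arrays and calling `max`, since Ruby arrays also compare element by element.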

I hope that helps explain it. If it does not, feel free to edit your question and point out the statements that you do not understand.
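If you want to step through the whole algorithm without the word-list files, a self-contained sketch along these lines may help. The tiny `counts` table and the function name `segment` are hypothetical stand-ins for the dictionary built from the word file and for `viterbi_segment`; the logic is meant to mirror the script in the question rather than replace it.

```python
# Hypothetical word counts standing in for the dictionary file.
counts = {"wicked": 2, "weather": 3, "we": 4, "at": 5, "her": 1}
total = float(sum(counts.values()))
max_word_length = max(map(len, counts))

def word_prob(word):
    # Relative frequency of the word; unknown words get probability 0.
    return counts.get(word, 0) / total

def segment(text):
    # probs[i]: best probability for text[:i].
    # lasts[i]: index where the last word of that best split starts.
    probs, lasts = [1.0], [0]
    for i in range(1, len(text) + 1):
        prob_k, k = max((probs[j] * word_prob(text[j:i]), j)
                        for j in range(max(0, i - max_word_length), i))
        probs.append(prob_k)
        lasts.append(k)
    # Walk the back-pointers from the end of the text to recover the words.
    words = []
    i = len(text)
    while i > 0:
        words.append(text[lasts[i]:i])
        i = lasts[i]
    words.reverse()
    return words, probs[-1]

words, prob = segment("wickedweather")
print(words, prob)
```

With these toy counts the split "wicked" + "weather" beats "wicked" + "we" + "at" + "her", because multiplying more word probabilities together makes the product smaller.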
