Ever needed an English-to-Czech dictionary, or vice versa?

I sometimes need an English-Czech and Czech-English dictionary. There is a great site, slovnik.cz, that does its job pretty well. However, it's quite slow to switch to the browser, navigate to the site, wait for it to load, etc.

After I got bored with waiting, I wrote a translator from Czech to English and vice versa. Here is the quick and dirty code:

. (Join-Path $powershellDir helpers\Format-Columns.ps1)

# Download a page as a string, sending browser-like request headers.
function download($url) {
    Write-Debug "Downloading $url"
    $webRequest = New-Object Net.WebClient
    $webRequest.Headers.Add("User-Agent", 'Mozilla/5.0 (Windows; U; Windows NT 6.0; cs; rv:1.9.1.5) '+
                            'Gecko/20091102 Firefox/3.5.5 (.NET CLR 3.5.30729) from Posh')
    $webRequest.Headers.Add("Accept", 'text/html,application/xhtml+xml,application/xml;')
    $webRequest.Headers.Add("Accept-Language", 'cs')
    $webRequest.Headers.Add("Accept-Encoding", 'deflate')
    $webRequest.Headers.Add("Accept-Charset", 'windows-1250') #utf-8
    # Decode the response body as UTF-8.
    $webRequest.Encoding = [System.Text.Encoding]::UTF8
    $webRequest.DownloadString($url)
}

# Convert real-world HTML into an XmlDocument with the help of SgmlReader.
function convert2xml($s) {
    # Drop the XHTML default namespace so the XPath in parse can use unprefixed element names.
    $s = $s.Replace(' xmlns="http://www.w3.org/1999/xhtml"', '')
    Add-Type -Path (Join-Path $powershellDir bin\SgmlReaderDll.dll)

    $sr = New-Object IO.StringReader $s

    $sgmlReader = New-Object Sgml.SgmlReader
    $sgmlReader.DocType = 'HTML'
    $sgmlReader.WhitespaceHandling = 'All'
    $sgmlReader.CaseFolding = 'ToLower'
    $sgmlReader.InputStream = $sr

    $xml = New-Object Xml.XmlDocument
    $xml.PreserveWhitespace = $true
    $xml.XmlResolver = $null
    $xml.Load($sgmlReader)

    $sgmlReader.Close()
    $sr.Close()

    $xml
}

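# Pick the translations of $word out of the parsed result page.
function parse($xml, $word) {
    $nodes = Select-Xml -Xml $xml -XPath '//div[@id="vocables_main"]/div[@class="pair"]'
    # Exact matches come first and are shown as the bare translation.
    $res = @($nodes |
        ? { $_.Node.span[0].InnerText -eq $word } |
        % { $_.Node.span[1].InnerText })
    # Longer phrases starting with $word follow, prefixed with the phrase in brackets.
    $res += @($nodes |
        ? { ($_.Node.span[0].InnerText -ne $word) -and $_.Node.span[0].InnerText.StartsWith($word) } |
        % { "[{0}] {1}" -f $($_.Node.span[0].InnerText), $_.Node.span[1].InnerText })
    $res
}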

function run($url, $word) {
    $res = @(parse (convert2xml (download $url)) $word)
    if ($res.Count -lt 20) { $res | Format-Columns -autosize -maxcol 4 }
    else                   { $res | Format-Columns -autosize }
}

function Translate-ToEnglish {
    run "http://slovnik.cz/bin/mld.fpl?vcb=$($args -join '+')&dictdir=encz.cz&lines=50" ($args -join ' ')
}

function Translate-ToCzech {
    run "http://slovnik.cz/bin/mld.fpl?vcb=$($args -join '+')&dictdir=encz.en&lines=50" ($args -join ' ')
}

Export-ModuleMember Translate-ToCzech, Translate-ToEnglish
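
To see what parse keeps and what it drops without hitting the site, here is a minimal sketch that runs the same filtering on a hand-written fragment. The markup is only an assumption about the shape of the page, inferred from the XPath used in parse; the real page may well look different. The sketch reads the spans through SelectSingleNode instead of the .span property so that it also works on this simplified sample.

```powershell
# Hand-written sample; its structure is an assumption inferred from the XPath in parse.
$sample = [xml]@'
<html><body><div id="vocables_main">
  <div class="pair"><span>put up</span><span>postavit</span></div>
  <div class="pair"><span>put up at</span><span>ubytovat kde</span></div>
  <div class="pair"><span>put down</span><span>zapsat</span></div>
</div></body></html>
'@

$word = 'put up'

# Turn every "pair" div into an object holding the source phrase and its translation.
$pairs = Select-Xml -Xml $sample -XPath '//div[@id="vocables_main"]/div[@class="pair"]' |
    ForEach-Object {
        New-Object PSObject -Property @{
            Source = $_.Node.SelectSingleNode('span[1]').InnerText
            Target = $_.Node.SelectSingleNode('span[2]').InnerText
        }
    }

# Exact matches: only the translation is shown.
$exact = @($pairs |
    Where-Object { $_.Source -eq $word } |
    ForEach-Object { $_.Target })

# Longer phrases that start with the word: shown as "[phrase] translation".
$phrases = @($pairs |
    Where-Object { ($_.Source -ne $word) -and $_.Source.StartsWith($word) } |
    ForEach-Object { '[{0}] {1}' -f $_.Source, $_.Target })

$result = $exact + $phrases
$result
# postavit
# [put up at] ubytovat kde
```

The "put down" pair is dropped because it neither equals nor starts with the searched phrase.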

There are two external dependencies:

  • The Format-Columns cmdlet. I wrote about it some time ago; you can download it here.
  • The SgmlReader assembly. I'll provide a download link to my copy below; however, you can download the latest version as well.

I looked at some web APIs to avoid working with HTML, but they returned only one possible translation (which most of the time wasn't the best choice). I'm not done with them, though, and maybe in the near future I'll come up with a translator based purely on a decent web API.

Some examples

[1]: Translate-ToCzech put up
naložit                                              smířit se s                                          
navrhnout                                            smluvit předem                                       
podat                                                snést klidně                                         
předložit                                            spokojit se                                          
spokojit se s                                        ubytovat                                             
uvést (na scénu)                                     ubytovat se                                          
vystavit (na odiv)                                   uložit ke spaní                                      
zarazit (zastavit, hovor.)                           uspořádat                                            
zastrčit (meč)                                       vyvěsit                                              
zvednout (ceny)                                      vztyčit                                              
postavit                                             zabalit                                              
zvýšit (ceny)                                        zavařit                                              
vykasat                                              zbudovat (přen.)                                     
pobízet                                              [put up a post for] vypsat konkurz                   
poskytnout nocleh                                    [put up a resistance] klást odpor                    
konzervovat                                          [put up at] zarazit kde (zastavit, hovor.)           
balit                                                [put up at] ubytovat kde                             
nabízet se                                           [put up at] ubytovat se kde                          
najít úkryt                                          [put up for sale] prodávat                           
nastrojit (podfuk, přen.)                            [put up for sale] dát do prodeje                     
navádět                                              [put up resistance] vzepřít se (slov.) g             
plašit                                               [put up the message] zobrazí zprávu                  
položit nahoru                                       [put up the shutters] nechat obchodu (přen.)         
ponoukat                                             [put up the shutters] stáhnout rolety (u obchodu)    
předvolat                                            [put up the shutters] zavřít krám

[2]: Translate-ToEnglish smířit se
bury the hatchet                                     [smířit se s] make one's peace with                  
do                                                   [smířit se s] make peace with                        
reconcile oneself g                                  [smířit se s] put up                                 
make it up                                           [smířit se s] reconcile to                           
make one's peace                                     [smířit se s] reconcile with                         
make peace                                           [smířit se s] sit down under                         
make up                                              [smířit se s] acquiesce (čím)                        
reconcile                                            [smířit se s] put up with (čím)                      
reconcile to                                         [smířit se s] sit down under (st.) (čím)             
resign to                                            [smířit se s čím] acquiesce                          
make it up (po hádce)                                [smířit se s čím] do with st.                        
take (s čím)                                         [smířit se s čím] put up with st.                    
[smířit se s] do with                                [smířit se s čím] sit down under st.                 
[smířit se s] make it up                             [smířit se se ztrátou] sacrifice
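
Note that both examples pass the phrase unquoted. That works because the functions read the automatic $args variable: the words are joined with '+' for the query string and with a space for matching against the results. A small sketch of that mechanism follows; Show-Query is a hypothetical helper used only for illustration and is not part of the module.

```powershell
# Show-Query is a hypothetical helper; it only shows how the unquoted words end up in the URL.
function Show-Query {
    New-Object PSObject -Property @{
        # Words joined with '+' form the vcb parameter of the request.
        Url  = "http://slovnik.cz/bin/mld.fpl?vcb=$($args -join '+')&dictdir=encz.en&lines=50"
        # Words joined with a space form the phrase that parse compares against.
        Word = $args -join ' '
    }
}

$q = Show-Query put up
$q.Url    # http://slovnik.cz/bin/mld.fpl?vcb=put+up&dictdir=encz.en&lines=50
$q.Word   # put up
```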

I load this module in my profile by default and create aliases like this:

Set-Alias tocz Translate-ToCzech
Set-Alias toen Translate-ToEnglish

In case you have a Posh console open somewhere and you don't want to look for it in your taskbar, you may appreciate hiding and bringing up the console via AutoHotkey.

Download

Meta: 2010-01-26, Pepa

Tags: PowerShell