<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-9084486178517287444</id><updated>2011-11-27T16:22:46.598-08:00</updated><category term='pdflatex'/><category term='linux'/><category term='Introduction'/><category term='id3v2'/><category term='tracklist'/><category term='id3'/><category term='xmms2 xbindkeys seek audio debian'/><category term='webscraper'/><category term='enconding'/><category term='language learning audio russian debian soundstretch ffmpeg mplayer'/><category term='colorize'/><category term='convert'/><category term='latex'/><category term='batch renaming'/><category term='deixto'/><category term='configure'/><category term='join sort'/><category term='web data extractor'/><category term='language'/><category term='syntax highlighting'/><category term='mutagen'/><category term='converting'/><category term='bash'/><category term='palehui'/><category term='sed'/><category term='pdf'/><category term='library'/><category term='mac osx'/><category term='www::mechanize'/><category term='shell'/><category term='pretty printer'/><category term='pdfimages'/><category term='html'/><category term='debian'/><category term='nahuatl'/><category term='iconv'/><category term='regular expressions'/><category term='mid3v2'/><category term='imagemagick'/><category term='code beautifier'/><category term='unicode'/><category term='dpi'/><category term='vim'/><category term='join sort uniq sed cut'/><category term='pkg-config'/><category term='library_path'/><category term='cp1251'/><title type='text'>Technical stuff mostly Linux related</title><subtitle type='html'>linux,scripts,physics,hardware</subtitle><link 
rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://palehui.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9084486178517287444/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://palehui.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><author><name>picodeoro</name><uri>http://www.blogger.com/profile/06403044534403570100</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>16</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-9084486178517287444.post-7824951941324674037</id><published>2011-03-25T09:41:00.000-07:00</published><updated>2011-03-25T09:51:12.826-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='join sort uniq sed cut'/><title type='text'>Most recent marks</title><content type='html'>&lt;i&gt;Problem&lt;/i&gt; We have two files with the grades of the students. 
The first file, &lt;b&gt;marks1.txt&lt;/b&gt;, holds the marks of the first exam; the second file, &lt;b&gt;marks2.txt&lt;/b&gt;, holds the marks of the second exam (a retake to improve one's mark, which overwrites the first mark).&lt;br /&gt;&lt;br /&gt;The second exam is optional, though.&lt;br /&gt;&lt;br /&gt;The exam file syntax is:&lt;br /&gt;&lt;b&gt;&amp;lt;student-id&amp;gt; &amp;lt;grade&amp;gt;&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;To sort, we have to be able to tell which file an entry stems from, so we prepend a &lt;b&gt;file-id&lt;/b&gt; column with the value &lt;b&gt;2&lt;/b&gt; for the &lt;b&gt;1st exam&lt;/b&gt; and &lt;b&gt;1&lt;/b&gt; for the &lt;b&gt;2nd exam&lt;/b&gt;, so that the more recent mark sorts first.&lt;br /&gt;&lt;br /&gt;We now sort both files, first on the &lt;b&gt;student-id&lt;/b&gt; (column 2) and then on the &lt;b&gt;file-id&lt;/b&gt; (column 1).&lt;br /&gt;&lt;br /&gt;For example:&lt;br /&gt;&lt;b&gt;marks1.txt&lt;/b&gt;: &lt;br /&gt;012 3&amp;nbsp;&amp;nbsp; --&amp;gt;&amp;nbsp;&amp;nbsp; 2 012 3&lt;br /&gt;&lt;br /&gt;&lt;b&gt;marks2.txt&lt;/b&gt;: &lt;br /&gt;012 4&amp;nbsp;&amp;nbsp; --&amp;gt;&amp;nbsp;&amp;nbsp; 1 012 4&lt;br /&gt;&lt;br /&gt;&lt;pre style="background: none repeat scroll 0% 0% rgb(246, 248, 255); color: #000020;"&gt;sort &lt;span style="color: #44aadd;"&gt;-k&lt;/span&gt; &lt;span style="color: #008c00;"&gt;2&lt;/span&gt;,&lt;span style="color: #008c00;"&gt;2&lt;/span&gt; &lt;span style="color: #44aadd;"&gt;-k&lt;/span&gt; &lt;span style="color: #008c00;"&gt;1&lt;/span&gt;,&lt;span style="color: #008c00;"&gt;1&lt;/span&gt; \&lt;br /&gt;       &lt;span style="color: #e34adc;"&gt;&amp;lt;&lt;/span&gt;&lt;span 
style="color: #406080;"&gt;(&lt;/span&gt;&lt;br /&gt;         &lt;span style="color: #7779bb; font-weight: bold;"&gt;sed&lt;/span&gt; &lt;span style="color: #44aadd;"&gt;-e&lt;/span&gt; &lt;span style="color: #1060b6;"&gt;'&lt;/span&gt;&lt;span style="color: #200080; font-weight: bold;"&gt;s&lt;/span&gt;&lt;span style="color: maroon;"&gt;/&lt;/span&gt;&lt;span style="color: #308080;"&gt;^&lt;/span&gt;&lt;span style="color: maroon;"&gt;/&lt;/span&gt;&lt;span style="color: #1060b6;"&gt;2 &lt;/span&gt;&lt;span style="color: maroon;"&gt;/&lt;/span&gt;&lt;span style="color: #1060b6;"&gt;'&lt;/span&gt; marks&lt;span style="color: #200080;"&gt;1&lt;/span&gt;&lt;span style="color: #200080; font-weight: bold;"&gt;.&lt;/span&gt;txt&lt;br /&gt;        &lt;span style="color: #406080;"&gt;)&lt;/span&gt; \&lt;br /&gt;       &lt;span style="color: #e34adc;"&gt;&amp;lt;&lt;/span&gt;&lt;span style="color: #406080;"&gt;(&lt;/span&gt;&lt;br /&gt;         &lt;span style="color: #7779bb; font-weight: bold;"&gt;sed&lt;/span&gt; &lt;span style="color: #44aadd;"&gt;-e&lt;/span&gt; &lt;span style="color: #1060b6;"&gt;'&lt;/span&gt;&lt;span style="color: #200080; font-weight: bold;"&gt;s&lt;/span&gt;&lt;span style="color: maroon;"&gt;/&lt;/span&gt;&lt;span style="color: #308080;"&gt;^&lt;/span&gt;&lt;span style="color: maroon;"&gt;/&lt;/span&gt;&lt;span style="color: #1060b6;"&gt;1 &lt;/span&gt;&lt;span style="color: maroon;"&gt;/&lt;/span&gt;&lt;span style="color: #1060b6;"&gt;'&lt;/span&gt; marks2&lt;span style="color: #200080; font-weight: bold;"&gt;.&lt;/span&gt;txt&lt;br /&gt;        &lt;span style="color: #406080;"&gt;)&lt;/span&gt;&lt;/pre&gt;&lt;br /&gt;After sorting we obtain:&lt;br /&gt;1 012 4&lt;br /&gt;2 012 3&lt;br /&gt;&lt;br /&gt;If a student-id occurs twice, we only want to retain the first line (the most recent mark).&lt;br /&gt;So let's first remove the &lt;b&gt;file-id&lt;/b&gt; column again with:&lt;br /&gt;&lt;pre style="background: none repeat scroll 0% 0% 
rgb(246, 248, 255); color: #000020;"&gt;cut --&lt;span style="color: #007d45;"&gt;delimiter&lt;/span&gt;&lt;span style="color: #308080;"&gt;=&lt;/span&gt;&lt;span style="color: #1060b6;"&gt;' '&lt;/span&gt; --&lt;span style="color: #007d45;"&gt;fields&lt;/span&gt;&lt;span style="color: #308080;"&gt;=&lt;/span&gt;&lt;span style="color: #008c00;"&gt;2&lt;/span&gt;- &lt;/pre&gt;&lt;br /&gt;Then remove the duplicates, comparing only the &lt;b&gt;student-id&lt;/b&gt; column (the 1st column, 3 characters wide):&lt;br /&gt;&lt;pre style="background: none repeat scroll 0% 0% rgb(246, 248, 255); color: #000020;"&gt;uniq &lt;span style="color: #44aadd;"&gt;-w&lt;/span&gt; 3&lt;/pre&gt;&lt;br /&gt;Finally, we want to link the student-ids to the students' names.&lt;br /&gt;Let's say we have a file &lt;b&gt;id_names.txt&lt;/b&gt; with the format:&lt;br /&gt;&lt;b&gt;&amp;lt;student-id&amp;gt; &amp;lt;student-name&amp;gt;&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;All we have to do is join on the &lt;b&gt;student-id&lt;/b&gt; column.&lt;br /&gt;&lt;br /&gt;The whole code:&lt;br /&gt;&lt;pre style="background: none repeat scroll 0% 0% rgb(246, 248, 255); color: #000020;"&gt;join \&lt;br /&gt;  &lt;span style="color: #e34adc;"&gt;&amp;lt;&lt;/span&gt;&lt;span style="color: #406080;"&gt;(&lt;/span&gt;&lt;br /&gt;    sort &lt;span style="color: #44aadd;"&gt;-k&lt;/span&gt; &lt;span style="color: #008c00;"&gt;2&lt;/span&gt;,&lt;span style="color: #008c00;"&gt;2&lt;/span&gt; &lt;span style="color: #44aadd;"&gt;-k&lt;/span&gt; &lt;span style="color: #008c00;"&gt;1&lt;/span&gt;,&lt;span style="color: #008c00;"&gt;1&lt;/span&gt; \&lt;br /&gt;       &lt;span style="color: #e34adc;"&gt;&amp;lt;&lt;/span&gt;&lt;span style="color: #406080;"&gt;(&lt;/span&gt;&lt;br /&gt;         &lt;span style="color: #7779bb; font-weight: bold;"&gt;sed&lt;/span&gt; &lt;span style="color: #44aadd;"&gt;-e&lt;/span&gt; &lt;span style="color: #1060b6;"&gt;'&lt;/span&gt;&lt;span style="color: #200080; font-weight: 
bold;"&gt;s&lt;/span&gt;&lt;span style="color: maroon;"&gt;/&lt;/span&gt;&lt;span style="color: #308080;"&gt;^&lt;/span&gt;&lt;span style="color: maroon;"&gt;/&lt;/span&gt;&lt;span style="color: #1060b6;"&gt;2 &lt;/span&gt;&lt;span style="color: maroon;"&gt;/&lt;/span&gt;&lt;span style="color: #1060b6;"&gt;'&lt;/span&gt; marks1&lt;span style="color: #200080; font-weight: bold;"&gt;.&lt;/span&gt;txt&lt;br /&gt;        &lt;span style="color: #406080;"&gt;)&lt;/span&gt; \&lt;br /&gt;       &lt;span style="color: #e34adc;"&gt;&amp;lt;&lt;/span&gt;&lt;span style="color: #406080;"&gt;(&lt;/span&gt;&lt;br /&gt;         &lt;span style="color: #7779bb; font-weight: bold;"&gt;sed&lt;/span&gt; &lt;span style="color: #44aadd;"&gt;-e&lt;/span&gt; &lt;span style="color: #1060b6;"&gt;'&lt;/span&gt;&lt;span style="color: #200080; font-weight: bold;"&gt;s&lt;/span&gt;&lt;span style="color: maroon;"&gt;/&lt;/span&gt;&lt;span style="color: #308080;"&gt;^&lt;/span&gt;&lt;span style="color: maroon;"&gt;/&lt;/span&gt;&lt;span style="color: #1060b6;"&gt;1 &lt;/span&gt;&lt;span style="color: maroon;"&gt;/&lt;/span&gt;&lt;span style="color: #1060b6;"&gt;'&lt;/span&gt; marks2&lt;span style="color: #200080; font-weight: bold;"&gt;.&lt;/span&gt;txt&lt;br /&gt;        &lt;span style="color: #406080;"&gt;)&lt;/span&gt; \&lt;br /&gt;    &lt;span style="color: #e34adc;"&gt;|&lt;/span&gt;&lt;br /&gt;     cut --&lt;span style="color: #007d45;"&gt;delimiter&lt;/span&gt;&lt;span style="color: #308080;"&gt;=&lt;/span&gt;&lt;span style="color: #1060b6;"&gt;' '&lt;/span&gt; --&lt;span style="color: #007d45;"&gt;fields&lt;/span&gt;&lt;span style="color: #308080;"&gt;=&lt;/span&gt;&lt;span style="color: #008c00;"&gt;2&lt;/span&gt;- \&lt;br /&gt;    &lt;span style="color: #e34adc;"&gt;|&lt;/span&gt;  &lt;br /&gt;     uniq &lt;span style="color: #44aadd;"&gt;-w&lt;/span&gt; &lt;span style="color: #008c00;"&gt;3&lt;/span&gt;&lt;br /&gt;   &lt;span style="color: #406080;"&gt;)&lt;/span&gt; \&lt;br 
/&gt;  &lt;span style="color: #e34adc;"&gt;&amp;lt;&lt;/span&gt;&lt;span style="color: #406080;"&gt;(&lt;/span&gt;&lt;br /&gt;    sort id_names&lt;span style="color: #200080; font-weight: bold;"&gt;.&lt;/span&gt;txt&lt;br /&gt;   &lt;span style="color: #406080;"&gt;)&lt;/span&gt; \&lt;br /&gt;  &lt;span style="color: #e34adc;"&gt;&amp;gt;&lt;/span&gt; &lt;span style="color: #200080;"&gt;marks&lt;/span&gt;&lt;span style="color: #200080; font-weight: bold;"&gt;.&lt;/span&gt;txt&lt;br /&gt;&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9084486178517287444-7824951941324674037?l=palehui.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://palehui.blogspot.com/feeds/7824951941324674037/comments/default' title='Kommentare zum Post'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=9084486178517287444&amp;postID=7824951941324674037' title='0 Kommentare'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9084486178517287444/posts/default/7824951941324674037'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9084486178517287444/posts/default/7824951941324674037'/><link rel='alternate' type='text/html' href='http://palehui.blogspot.com/2011/03/most-recent-marks.html' title='Most recent marks'/><author><name>picodeoro</name><uri>http://www.blogger.com/profile/06403044534403570100</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9084486178517287444.post-252151406852411033</id><published>2011-03-24T16:33:00.000-07:00</published><updated>2011-03-24T16:33:50.915-07:00</updated><title type='text'>online syntax 
highlighting</title><content type='html'>To generate the HTML for syntax-highlighted code I use an online service.&lt;br /&gt;&lt;br /&gt;For shell scripts I use &lt;a href="http://tohtml.com/shell/"&gt;http://tohtml.com/shell/&lt;/a&gt; with the style &lt;b&gt;navy&lt;/b&gt;.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9084486178517287444-252151406852411033?l=palehui.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://palehui.blogspot.com/feeds/252151406852411033/comments/default' title='Kommentare zum Post'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=9084486178517287444&amp;postID=252151406852411033' title='0 Kommentare'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9084486178517287444/posts/default/252151406852411033'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9084486178517287444/posts/default/252151406852411033'/><link rel='alternate' type='text/html' href='http://palehui.blogspot.com/2011/03/online-syntax-highlighting.html' title='online syntax highlighting'/><author><name>picodeoro</name><uri>http://www.blogger.com/profile/06403044534403570100</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9084486178517287444.post-5119466943524357125</id><published>2011-03-24T16:31:00.000-07:00</published><updated>2011-03-24T16:34:23.412-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='join sort'/><title type='text'>join caveat</title><content type='html'>The &lt;b&gt;join&lt;/b&gt; command requires both inputs to be sorted on the key field in ascending order; otherwise it may miss matching lines:&lt;br /&gt;&lt;br /&gt;&lt;pre 
style="background: none repeat scroll 0% 0% rgb(246, 248, 255); color: #000020;"&gt;join &lt;span style="color: #e34adc;"&gt;&amp;lt;&lt;/span&gt;&lt;span style="color: #406080;"&gt;(&lt;/span&gt; sort &lt;span style="color: #e34adc;"&gt;&amp;lt;&lt;/span&gt;file_A&lt;span style="color: #e34adc;"&gt;&amp;gt;&lt;/span&gt; &lt;span style="color: #406080;"&gt;)&lt;/span&gt; &lt;span style="color: #e34adc;"&gt;&amp;lt;&lt;/span&gt;&lt;span style="color: #406080;"&gt;(&lt;/span&gt; sort &lt;span style="color: #e34adc;"&gt;&amp;lt;&lt;/span&gt;file_B&lt;span style="color: #e34adc;"&gt;&amp;gt;&lt;/span&gt; &lt;span style="color: #406080;"&gt;)&lt;/span&gt;&amp;nbsp; &lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9084486178517287444-5119466943524357125?l=palehui.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://palehui.blogspot.com/feeds/5119466943524357125/comments/default' title='Kommentare zum Post'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=9084486178517287444&amp;postID=5119466943524357125' title='0 Kommentare'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9084486178517287444/posts/default/5119466943524357125'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9084486178517287444/posts/default/5119466943524357125'/><link rel='alternate' type='text/html' href='http://palehui.blogspot.com/2011/03/sort-caveat.html' title='join caveat'/><author><name>picodeoro</name><uri>http://www.blogger.com/profile/06403044534403570100</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9084486178517287444.post-1782131326237379706</id><published>2010-09-24T12:58:00.000-07:00</published><updated>2010-09-24T13:07:38.587-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='language learning audio russian debian soundstretch ffmpeg mplayer'/><title type='text'></title><content type='html'>To drill my Russian listening comprehension I use audio news that come with a reasonably faithful transcript (so far I haven't found a site that provides a 1:1 transcript).&lt;br /&gt;In addition, news anchors often speak very fast, so to understand them at all I first have to slow down the tempo of the audio, and for that I have to download it:&lt;br /&gt;&lt;br /&gt;On &lt;a href="http://www.ntv.ru/novosti"&gt;http://www.ntv.ru/novosti&lt;/a&gt; you can download the videos, and transcripts are supplied.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;The downloaded video is in mp4 format. There are now two options for extracting the audio:&lt;br /&gt;1a) convert it with mplayer into wav format:&lt;br /&gt;&amp;nbsp;&amp;nbsp; mplayer -ao pcm:file="out.wav" "in.mp4"&lt;br /&gt;&lt;br /&gt;1b) convert it with ffmpeg directly into mp3 format:&lt;br /&gt;&amp;nbsp;&amp;nbsp; ffmpeg -i "in.mp4" -vn -acodec libmp3lame -ab 128k "out.mp3"&lt;br /&gt;&lt;br /&gt;2) If the tempo is too fast to comprehend, slow it down with soundstretch (the pitch won't be changed!):&lt;br /&gt;&amp;nbsp;&amp;nbsp; soundstretch "in.wav" "out.wav" -tempo=-20&lt;br /&gt;&lt;br /&gt;That slows down the tempo by 20%.&lt;br /&gt;Soundstretch needs a wave file as input and writes a wave file, so use the output of 1a) here.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9084486178517287444-1782131326237379706?l=palehui.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' 
type='application/atom+xml' href='http://palehui.blogspot.com/feeds/1782131326237379706/comments/default' title='Kommentare zum Post'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=9084486178517287444&amp;postID=1782131326237379706' title='0 Kommentare'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9084486178517287444/posts/default/1782131326237379706'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9084486178517287444/posts/default/1782131326237379706'/><link rel='alternate' type='text/html' href='http://palehui.blogspot.com/2010/09/for-drilling-my-listening-comprehension.html' title=''/><author><name>picodeoro</name><uri>http://www.blogger.com/profile/06403044534403570100</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9084486178517287444.post-967295531064002146</id><published>2010-09-24T12:47:00.000-07:00</published><updated>2010-09-24T13:04:45.272-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='xmms2 xbindkeys seek audio debian'/><title type='text'>Enable seeking with hotkeys for xmms2</title><content type='html'>&lt;span style="font-size: large;"&gt;Intro&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;I study foreign languages with audio-books and transcripts. 
If I haven't understood a word, I seek back a few seconds in the audio. Until now I did this manually, by dragging the slider on the seek bar of my favorite sound player GUI.&lt;br /&gt;This is of course tedious when you're reading the transcript: you have to move your eyes from the text to the seek bar and then also hit the right amount of time.&lt;br /&gt;I needed something more suitable.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size: large;"&gt;Specification&lt;/span&gt;&lt;br /&gt;&lt;span style="font-size: large;"&gt;&lt;span style="font-size: small;"&gt;I'm running Debian 5 with Gnome and wanted&lt;/span&gt; &lt;/span&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;a sound player with a command-line interface that accepts the following commands:&lt;/li&gt;&lt;ul&gt;&lt;li&gt;seek by a given number of seconds&lt;/li&gt;&lt;li&gt;toggle between play and pause&lt;/li&gt;&lt;li&gt;next / previous track&lt;/li&gt;&lt;/ul&gt;&lt;li&gt;a way to bind these commands to the multimedia keys of my keyboard&lt;/li&gt;&lt;/ul&gt;&lt;span style="font-size: large;"&gt;Solution&lt;/span&gt; &lt;br /&gt;&lt;br /&gt;The best solution I found is xmms2 as the sound player and xbindkeys for binding the commands to keys.&lt;br /&gt;&lt;b&gt;&lt;br /&gt;&lt;/b&gt;&lt;br /&gt;&lt;h3&gt;xmms2&lt;/h3&gt;&lt;br /&gt;xmms2 uses a client/server architecture.&lt;br /&gt;To start the server in daemon mode invoke 'xmms2-launcher'; note that it can't be started as root.&lt;br /&gt;A rudimentary GUI for xmms2 is gxmms2.&lt;br /&gt;&lt;br /&gt;Both xmms2 and xbindkeys have to be started each time you log in, so I added them as autostart programs in 'System-&amp;gt;Session-&amp;gt;Autostart programs'.&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;xbindkeys&lt;/h3&gt;&lt;br /&gt;At first I had problems binding the multimedia keys, but then I &lt;a href="http://wwww.ubuntuforums.org/showthread.php?p=5655202"&gt;read here&lt;/a&gt; that Debian's/Ubuntu's graphical keyboard shortcuts utility keeps these keys from sending keycodes, as can be seen 
with the command 'xev'.&lt;br /&gt;Solution: remove the bindings for these keys in the graphical shortcut utility; afterwards the keycodes show up.&lt;br /&gt;&lt;br /&gt;A great tutorial on xbindkeys (in German) can &lt;a href="http://wiki.ubuntuusers.de/xbindkeys"&gt;be found here&lt;/a&gt;; it also covers mouse buttons, sending key combinations to programs (package xautomation, program xte) and graphical/sound feedback.&lt;br /&gt;&lt;br /&gt;My .xbindkeysrc looks as follows (I use a Cherry G230 keyboard):&lt;br /&gt;&lt;code&gt;&lt;br /&gt;#Toggle between play/pause&lt;br /&gt;"xmms2 toggleplay"&lt;br /&gt;m:0x0 + c:162&lt;br /&gt;&lt;br /&gt;#Seek +5 seconds&lt;br /&gt;"xmms2 seek +5"&lt;br /&gt;m:0x0 + c:153&lt;br /&gt;&lt;br /&gt;#Seek -5 seconds&lt;br /&gt;"xmms2 seek -5"&lt;br /&gt;m:0x0 + c:144&lt;br /&gt;&lt;/code&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9084486178517287444-967295531064002146?l=palehui.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://palehui.blogspot.com/feeds/967295531064002146/comments/default' title='Kommentare zum Post'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=9084486178517287444&amp;postID=967295531064002146' title='0 Kommentare'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9084486178517287444/posts/default/967295531064002146'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9084486178517287444/posts/default/967295531064002146'/><link rel='alternate' type='text/html' href='http://palehui.blogspot.com/2010/09/enable-seeking-with-hotkeys-for-xmms2.html' title='Enable seeking with hotkeys for xmms2'/><author><name>picodeoro</name><uri>http://www.blogger.com/profile/06403044534403570100</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9084486178517287444.post-8251245704012992962</id><published>2010-05-20T13:54:00.000-07:00</published><updated>2010-05-20T14:03:32.161-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='tracklist'/><category scheme='http://www.blogger.com/atom/ns#' term='id3v2'/><category scheme='http://www.blogger.com/atom/ns#' term='id3'/><title type='text'>Link Tracklist metadata to id3-tags</title><content type='html'>Problem: You have an mp3 album whose files have no id3 tags set, but you've found a tracklist in the format:&lt;br /&gt;&lt;br /&gt;&amp;lt;track-number&amp;gt;. &amp;lt;artist&amp;gt; - &amp;lt;title&amp;gt;&lt;br /&gt;&lt;br /&gt;The tags can then be set with:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;cat tracklist.txt |&lt;br /&gt; while IFS= read -r f; do&lt;br /&gt;  # track number: everything before the first ". "&lt;br /&gt;  tr=$(echo "$f" | sed -e 's/\. .*//')&lt;br /&gt;  # artist: text between the track number and the last " - "&lt;br /&gt;  a=$(echo "$f" | sed -e 's/^[^.]*\. \(.*\) - .*/\1/')&lt;br /&gt;  # title: everything after the last " - "&lt;br /&gt;  t=$(echo "$f" | sed -e 's/.* - \(.*\)/\1/')&lt;br /&gt;  # tag the file whose name starts with the track number&lt;br /&gt;  mid3v2 -T "$tr" -a "$a" -t "$t" -A "100 хитов русского рока" -g 17 \&lt;br /&gt;    "$(find ./ -type f -iname "${tr}-*")"&lt;br /&gt; done&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;We pipe the tracklist 'tracklist.txt' to a loop, which does the following for each line (i.e. 
for each mp3 file):&lt;br /&gt;We extract the track number 'tr', the artist 'a' and the title 't'.&lt;br /&gt;Then we set the id3v2.4 tags of the corresponding file with mutagen's mid3v2; the file is identified by its track number, and we get its filename with a find command.&lt;br /&gt;Additionally we set the album (-A) to '100 хитов русского рока' (100 Russian rock hits) and the genre (-g) to 17 (Rock).&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9084486178517287444-8251245704012992962?l=palehui.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://palehui.blogspot.com/feeds/8251245704012992962/comments/default' 
title='Kommentare zum Post'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=9084486178517287444&amp;postID=8251245704012992962' title='0 Kommentare'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9084486178517287444/posts/default/8251245704012992962'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9084486178517287444/posts/default/8251245704012992962'/><link rel='alternate' type='text/html' href='http://palehui.blogspot.com/2010/05/link-tracklist-metadata-to-id3-tags.html' title='Link Tracklist metadata to id3-tags'/><author><name>picodeoro</name><uri>http://www.blogger.com/profile/06403044534403570100</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9084486178517287444.post-312598350711378064</id><published>2010-05-20T13:38:00.000-07:00</published><updated>2010-05-20T13:55:05.226-07:00</updated><title type='text'>convmv - Convert encoding of filenames.</title><content type='html'>convmv is a Perl script that converts filenames from one encoding to another.&lt;br /&gt;&lt;br /&gt;For example:&lt;br /&gt;&lt;pre&gt;$ convmv -f iso8859-15 -t utf-8 --notest *mp3&lt;/pre&gt;&lt;br /&gt;converts the names of all mp3 files in the current directory from iso8859-15 to utf-8.&lt;br /&gt;--notest is needed to actually rename the files; otherwise convmv does a dry run and only prints what it would do.&lt;br /&gt;&lt;br /&gt;You can even convert a whole file tree (recursively) by adding the -r option.&lt;br /&gt;&lt;br /&gt;It also checks whether a filename is already encoded in utf-8 and, if so, skips it.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9084486178517287444-312598350711378064?l=palehui.blogspot.com' alt='' 
/&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://palehui.blogspot.com/feeds/312598350711378064/comments/default' title='Kommentare zum Post'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=9084486178517287444&amp;postID=312598350711378064' title='0 Kommentare'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9084486178517287444/posts/default/312598350711378064'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9084486178517287444/posts/default/312598350711378064'/><link rel='alternate' type='text/html' href='http://palehui.blogspot.com/2010/05/convmv-convert-encoding-of-filenames.html' title='convmv - Convert encoding of filenames.'/><author><name>picodeoro</name><uri>http://www.blogger.com/profile/06403044534403570100</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9084486178517287444.post-7445167412844435968</id><published>2010-05-16T06:29:00.000-07:00</published><updated>2010-05-20T13:53:48.706-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='mutagen'/><category scheme='http://www.blogger.com/atom/ns#' term='id3v2'/><category scheme='http://www.blogger.com/atom/ns#' term='unicode'/><category scheme='http://www.blogger.com/atom/ns#' term='id3'/><category scheme='http://www.blogger.com/atom/ns#' term='mid3v2'/><category scheme='http://www.blogger.com/atom/ns#' term='convert'/><title type='text'>Utf-8 id3v2.4 tags</title><content type='html'>The UTF-8 support of id3 &amp;amp; id3v2 is broken. 
With id3v2 -l &amp;lt;mp3-file&amp;gt; the output will look OK, but other programs won't be able to detect the character encoding and will display mojibake (scrambled text).&lt;br /&gt;&lt;br /&gt;Solution: use another id3 tagging tool, e.g. mutagen's mid3v2 and mid3iconv.&lt;br /&gt;&lt;br /&gt;mid3iconv is very handy: it converts the id3 tags from a source character encoding (specified with -e &amp;lt;character-set&amp;gt;, default:UTF-8) to UTF-8 and also writes the correct character-encoding metadata into the id3v2.4 tags, so they'll display well in other apps.&lt;br /&gt;&lt;br /&gt;If you've already converted the id3 tags to UTF-8 with id3/id3v2 you can repair them with:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;$ find ./ -iname "*.mp3" | while IFS= read -r f; do  mid3iconv --remove-v1 -d  "$f"; done &lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9084486178517287444-7445167412844435968?l=palehui.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://palehui.blogspot.com/feeds/7445167412844435968/comments/default' title='Kommentare zum Post'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=9084486178517287444&amp;postID=7445167412844435968' title='0 Kommentare'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9084486178517287444/posts/default/7445167412844435968'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9084486178517287444/posts/default/7445167412844435968'/><link rel='alternate' type='text/html' href='http://palehui.blogspot.com/2010/05/utf-8-id3v24-tags.html' title='Utf-8 id3v2.4 tags'/><author><name>picodeoro</name><uri>http://www.blogger.com/profile/06403044534403570100</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9084486178517287444.post-345727663041276666</id><published>2009-09-20T08:12:00.000-07:00</published><updated>2010-02-28T11:01:30.053-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='pdflatex'/><category scheme='http://www.blogger.com/atom/ns#' term='pdfimages'/><category scheme='http://www.blogger.com/atom/ns#' term='dpi'/><category scheme='http://www.blogger.com/atom/ns#' term='latex'/><category scheme='http://www.blogger.com/atom/ns#' term='linux'/><category scheme='http://www.blogger.com/atom/ns#' term='pdf'/><category scheme='http://www.blogger.com/atom/ns#' term='imagemagick'/><category scheme='http://www.blogger.com/atom/ns#' term='convert'/><title type='text'>Resizing images in a pdf document</title><content type='html'>Today I'm going to resize an image within a pdf document from the command line (batch mode). There is a plethora of free tools for this, but I will stick with the following ones:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;b&gt;pdfimages&lt;/b&gt;&lt;/li&gt;&lt;li&gt;&lt;b&gt;convert &lt;/b&gt;(from the venerable ImageMagick toolkit)&lt;/li&gt;&lt;li&gt;&lt;b&gt;pdflatex&lt;/b&gt;&lt;/li&gt;&lt;li&gt;a pdf viewer of your choice&lt;/li&gt;&lt;/ul&gt;Because you can't really edit a pdf document (aside from with some expensive proprietary tools, e.g. Adobe Acrobat Writer) we have to extract the desired data, process it and output it into a new pdf document.&lt;br /&gt;&lt;br /&gt;So for image resizing the workflow will be:&lt;br /&gt;Extract image -&amp;gt; Resize image -&amp;gt; Output to new pdf document&lt;br /&gt;&lt;br /&gt;Sounds easy - doesn't it? 
But there's one caveat to it:&lt;br /&gt;By resizing I mean changing the physical size of the image as it is output on the printing device.&lt;br /&gt;&lt;br /&gt;The output device (printer, screen) has a pixel density associated with it called dots per inch or &lt;a href="http://en.wikipedia.org/wiki/Dots_per_inch"&gt;dpi&lt;/a&gt; (read the definition if you're not familiar with it).&lt;br /&gt;If the dpi values of the display device and the printer don't match, images that have no fixed dpi associated with them will have different physical sizes on these devices. In short, understanding dpi is essential.&lt;br /&gt;&lt;br /&gt;To avoid this problem we have to determine the dpi value of the image in the pdf document and, after resizing, output it with the same dpi value.&lt;br /&gt;&lt;br /&gt;That said, there are actually two methods of resizing an image (physical size):&lt;br /&gt;&lt;ol&gt;&lt;li&gt;already mentioned above: original dpi&amp;nbsp; equals new dpi,&amp;nbsp; (pixel-)resizing the image&lt;/li&gt;&lt;li&gt;changing the new dpi but retaining the image (pixel-)size: original dpi doesn't equal new dpi&lt;/li&gt;&lt;/ol&gt;&lt;span style="font-size: x-small;"&gt;*With original/new dpi I refer to the dpi of the original/new pdf document&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;The latter is admittedly the worse approach, because output devices have a certain intrinsic dpi-value which yields the best results; with other dpi-values the output device has to scale the data to its intrinsic dpi-value.&lt;br /&gt;Of course you have no control over the algorithms applied in this process - in stark contrast to the first method.&lt;br /&gt;&lt;br /&gt;Unfortunately I haven't found an easy way to determine the dpi-value of an image within a pdf-document.&lt;br /&gt;In Acrobat Reader Professional 6+ there's&amp;nbsp; allegedly a tool called "preflight" and another called "pitstop" which can extract this information - both very expensive.&lt;br /&gt;&lt;br /&gt;My approach 
is to guess the dpi-value, generate the pdf document, and compare the image size in a pdf viewer with the original pdf document (assuming equal horizontal and vertical dpi-values).&lt;br /&gt;This works quite well for most pdf documents, but there also exist documents where the horizontal dpi-value differs from the vertical one. In this case you first adjust the horizontal dpi-value so that the widths of both images match and then you adjust the vertical dpi-value so that the heights match (or the other way round).&lt;br /&gt;&lt;br /&gt;&lt;h2&gt;Task&lt;/h2&gt;Say we've got a pdf-file called "a.pdf" with two images in it and we want a new pdf containing the first image at 75% of its original size.&lt;br /&gt;&lt;br /&gt;&lt;h2&gt;Steps&lt;/h2&gt;To extract&amp;nbsp; the image we call pdfimages&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;$ pdfimages a.pdf&amp;nbsp;&amp;nbsp; image&lt;/pre&gt;This will create two files: "image-000.ppm" and "image-001.ppm"&amp;nbsp; &lt;br /&gt;&amp;nbsp;"image" is just a prefix for the filenames the extracted images are saved under&lt;br /&gt;&lt;br /&gt;To determine the dpi-value of the image we have to create a pdf document with the image in it and with a guessed dpi-value and adjust it until the images have the same size in the pdf viewer (as explained in detail above).&lt;br /&gt;&lt;br /&gt;To associate a dpi-value with the image we have to convert it into a jpeg and write the value into its header - since the \includegraphics command doesn't support a dpi-value argument and pdflatex doesn't support the "ppm" image type:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;$ convert image-000.ppm -quality 100% -density 160x160 image1.jpg&lt;/pre&gt;Here we converted the image into a jpeg with a dpi-value (density) of 160 and nearly lossless compression - quality 100%.&lt;br /&gt;&lt;br /&gt;The latex document to create a pdf document is very simple, it is just the bare scaffold needed to include an image:&lt;br /&gt;&lt;br 
/&gt;&lt;pre&gt;&lt;span class="Statement"&gt;\documentclass&lt;/span&gt;&lt;span class="Special"&gt;{&lt;/span&gt;&lt;span class="PreProc"&gt;scrartcl&lt;/span&gt;&lt;span class="Special"&gt;}&lt;/span&gt;&lt;br /&gt;&lt;span class="Statement"&gt;\usepackage&lt;/span&gt;&lt;span class="Special"&gt;[&lt;/span&gt;&lt;span class="Constant"&gt;utf8&lt;/span&gt;&lt;span class="Special"&gt;]&lt;/span&gt;&lt;span class="Special"&gt;{&lt;/span&gt;&lt;span class="Special"&gt;inputenc&lt;/span&gt;&lt;span class="Special"&gt;}&lt;/span&gt;&lt;br /&gt;&lt;span class="Statement"&gt;\usepackage&lt;/span&gt;&lt;span class="Special"&gt;{&lt;/span&gt;&lt;span class="Special"&gt;graphicx&lt;/span&gt;&lt;span class="Special"&gt;}&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class="PreProc"&gt;\begin{document}&lt;/span&gt;&lt;br /&gt;&lt;span class="Statement"&gt;\begin&lt;/span&gt;&lt;span class="Special"&gt;{&lt;/span&gt;&lt;span class="PreProc"&gt;figure&lt;/span&gt;&lt;span class="Special"&gt;}&lt;/span&gt;&lt;br /&gt;&lt;span class="Statement"&gt;\includegraphics&lt;/span&gt;&lt;span class="Special"&gt;{&lt;/span&gt;&lt;span class="Special"&gt;image1.jpg&lt;/span&gt;&lt;span class="Special"&gt;}&lt;/span&gt;&lt;br /&gt;&lt;span class="Statement"&gt;\end&lt;/span&gt;&lt;span class="Special"&gt;{&lt;/span&gt;&lt;span class="PreProc"&gt;figure&lt;/span&gt;&lt;span class="Special"&gt;}&lt;/span&gt;&lt;br /&gt;&lt;span class="PreProc"&gt;\end{document}&lt;/span&gt;&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&amp;nbsp;Save this file under "b.tex". 
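&lt;br /&gt;&lt;br /&gt;Since the image is included without an explicit size, its size on the page follows from the density stored in the jpeg header: size in inches = pixels / dpi. A minimal sketch of that arithmetic, assuming a hypothetical 800 pixel wide image (print_size_in is a made-up helper, not part of any tool used here):&lt;br /&gt;
&lt;pre&gt;
```shell
# Hypothetical helper: printed size in inches = pixels / dpi.
print_size_in() {
    # $1 = size in pixels, $2 = density in dpi
    LC_ALL=C awk -v px="$1" -v dpi="$2" 'BEGIN { printf "%.2f\n", px / dpi }'
}

print_size_in 800 160   # original pixels at the guessed density
print_size_in 600 160   # method 1: pixels scaled to 75%, same density
print_size_in 800 200   # method 2: same pixels, higher density
```
&lt;/pre&gt;
If the guessed density is too high the image comes out smaller than in the original document, if it is too low it comes out larger.&lt;br /&gt;&lt;br /&gt;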
To create the pdf document "b.pdf" simply write:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;$ pdflatex b.tex&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;After you have successfully determined the dpi-value(s) of the image you can go on to the final step of resizing it:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;$ convert image-000.ppm -quality 100% -density 160x160 -scale 75% image1.jpg&lt;br /&gt;&lt;/pre&gt;It's actually the same as above with a new argument "-scale 75%" - which rescales the image to 75% of its original (pixel-)size.&lt;br /&gt;&lt;br /&gt;Again run&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;$ pdflatex b.tex&lt;br /&gt;&lt;/pre&gt;and you should now have a pdf document containing the image at 75% of its original size.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9084486178517287444-345727663041276666?l=palehui.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://palehui.blogspot.com/feeds/345727663041276666/comments/default' title='Kommentare zum Post'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=9084486178517287444&amp;postID=345727663041276666' title='0 Kommentare'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9084486178517287444/posts/default/345727663041276666'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9084486178517287444/posts/default/345727663041276666'/><link rel='alternate' type='text/html' href='http://palehui.blogspot.com/2009/09/resizing-images-in-pdf-document.html' title='Resizing images in a pdf document'/><author><name>picodeoro</name><uri>http://www.blogger.com/profile/06403044534403570100</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9084486178517287444.post-4959241425249734981</id><published>2009-07-04T11:40:00.000-07:00</published><updated>2009-09-21T05:57:28.631-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='vim'/><category scheme='http://www.blogger.com/atom/ns#' term='mac osx'/><category scheme='http://www.blogger.com/atom/ns#' term='bash'/><category scheme='http://www.blogger.com/atom/ns#' term='syntax highlighting'/><category scheme='http://www.blogger.com/atom/ns#' term='pretty printer'/><category scheme='http://www.blogger.com/atom/ns#' term='code beautifier'/><category scheme='http://www.blogger.com/atom/ns#' term='shell'/><category scheme='http://www.blogger.com/atom/ns#' term='html'/><category scheme='http://www.blogger.com/atom/ns#' term='colorize'/><title type='text'>Shell script pretty printer</title><content type='html'>If your favorite syntax highlighter doesn't support html output and/or your language, chances are Vim does.&lt;br /&gt;&lt;br /&gt;Vim is a highly customizable text editor, in functionality arguably comparable only to Emacs (which I have never tried!) 
&lt;br /&gt;&lt;br /&gt;If your script gets highlighted correctly in Vim you can easily export it to an html page for use in your blog, with the script &lt;a href="http://codesnippets.joyent.com/posts/show/2073"&gt;Colorize&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;By default this script targets Mac OS X and only highlights shell scripts (bash), but it's easily adaptable to every supported language.&lt;br /&gt;&lt;br /&gt;It uses the 2html.vim plugin, which features:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;XHTML/HTML 4 support&lt;br /&gt;&lt;/li&gt;&lt;li&gt;CSS styles for highlighting&lt;/li&gt;&lt;li&gt;Line numbering&lt;/li&gt;&lt;li&gt;Folding&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;The script makes use of the "-s-ex" option for vim (for more info open vim and type ":help -s-ex").&lt;br /&gt;If you call 2html.vim interactively and the colors of the output html file don't fit, try setting&lt;br /&gt;":set t_Co=256" before exporting. This tells vim that the terminal supports 256 colors.&lt;br /&gt;&lt;br /&gt;Here's an example of an interactive vim-session, with options set for css-styles, xhtml, line numbering, unfolding, tab expansion and an unobtrusive colorscheme "peachpuff" (also used on this site). To generate the html code simply write ":run! 
syntax/2html.vim" and then save the new window.&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;vim &lt;span class="Special"&gt;-e&lt;/span&gt;  &lt;span class="Statement"&gt;\&lt;/span&gt;&lt;br /&gt;-S &lt;span class="Statement"&gt;&amp;lt;&lt;/span&gt;&lt;span class="Statement"&gt;(&lt;/span&gt;&lt;br /&gt;   &lt;span class="Statement"&gt;echo&lt;/span&gt;&lt;span class="Constant"&gt; &lt;/span&gt;&lt;span class="Statement"&gt;'&lt;/span&gt;&lt;br /&gt;&lt;span class="Constant"&gt;      set nocompatible&lt;/span&gt;&lt;br /&gt;&lt;span class="Constant"&gt;      set t_Co=256&lt;/span&gt;&lt;br /&gt;&lt;span class="Constant"&gt;      let use_xhtml=1&lt;/span&gt;&lt;br /&gt;&lt;span class="Constant"&gt;      let html_number_lines=1&lt;/span&gt;&lt;br /&gt;&lt;span class="Constant"&gt;      let html_use_css=1&lt;/span&gt;&lt;br /&gt;&lt;span class="Constant"&gt;      let html_ignore_folding=1&lt;/span&gt;&lt;br /&gt;&lt;span class="Constant"&gt;      set expandtab&lt;/span&gt;&lt;br /&gt;&lt;span class="Constant"&gt;      set tabstop=4&lt;/span&gt;&lt;br /&gt;&lt;span class="Constant"&gt;      set background=light&lt;/span&gt;&lt;br /&gt;&lt;span class="Constant"&gt;      syntax on&lt;/span&gt;&lt;br /&gt;&lt;span class="Constant"&gt;      colorscheme peachpuff&lt;/span&gt;&lt;br /&gt;&lt;span class="Constant"&gt;      vi&lt;/span&gt;&lt;br /&gt;&lt;span class="Constant"&gt;   &lt;/span&gt;&lt;span class="Statement"&gt;'&lt;/span&gt;&lt;br /&gt;&lt;span class="Statement"&gt;)&lt;/span&gt; &lt;span class="Statement"&gt;&amp;lt;&lt;/span&gt;your_script_file&lt;span class="Statement"&gt;&amp;gt;&lt;/span&gt;&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;b&gt;Note&lt;/b&gt;: As of Vim 7.0 enabling line numbering causes an error for LaTeX files, which results in incomplete formatting.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9084486178517287444-4959241425249734981?l=palehui.blogspot.com' alt='' 
/&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://palehui.blogspot.com/feeds/4959241425249734981/comments/default' title='Kommentare zum Post'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=9084486178517287444&amp;postID=4959241425249734981' title='0 Kommentare'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9084486178517287444/posts/default/4959241425249734981'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9084486178517287444/posts/default/4959241425249734981'/><link rel='alternate' type='text/html' href='http://palehui.blogspot.com/2009/07/shell-script-pretty-printer.html' title='Shell script pretty printer'/><author><name>picodeoro</name><uri>http://www.blogger.com/profile/06403044534403570100</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9084486178517287444.post-369850831529544476</id><published>2009-07-03T10:19:00.000-07:00</published><updated>2009-07-03T11:25:15.084-07:00</updated><title type='text'>Deep file encoding converter</title><content type='html'>Let's say we copied the directory "Miguel Bosé", containing some mp3-files, from a FAT32 partition to a Linux one (UTF-8).&lt;br /&gt;&lt;br /&gt;In the terminal it will be displayed like this:&lt;br /&gt;Miguel Bos�&lt;br /&gt;Miguel Bos�/CD 1&lt;br /&gt;Miguel Bos�/CD 1/14 - Si... 
Piensa En M�.mp3&lt;br /&gt;Miguel Bos�/CD 1/13 - Never Gonna Fall In Love Again.mp3&lt;br /&gt;Miguel Bos�/CD 1/08 - Te Amar�.mp3&lt;br /&gt;Miguel Bos�/CD 1/09 - M�rchate Ya.mp3&lt;br /&gt;...&lt;br /&gt;Miguel Bos�/CD 2&lt;br /&gt;Miguel Bos�/CD 2/19 - Te Dir�.mp3&lt;br /&gt;Miguel Bos�/CD 2/23 - Voy A Ganar.mp3&lt;br /&gt;Miguel Bos�/CD 2/16 - Sevilla.mp3&lt;br /&gt;Miguel Bos�/CD 2/25 - Se�or Padre.mp3&lt;br /&gt;....&lt;br /&gt;The problem is - as usual - the encoding. Under Windows the names were iso8859-1 encoded, but under Linux they are expected to be utf-8, so we have to convert the file and directory names.&lt;br /&gt;&lt;br /&gt;Our first attempt would probably be a simple script like the following one:&lt;br /&gt;find ./Migu* | #We use the * wildcard, for we can't enter the character easily&lt;br /&gt;   while read f; do &lt;br /&gt;      fc=$(echo -n "$f" | iconv -f iso8859-1); #Convert filename&lt;br /&gt;      mv "$f" "$fc";&lt;br /&gt;   done;&lt;br /&gt;&lt;br /&gt;This script will fail on the first nested entry with a "file not found" error. Why?&lt;br /&gt;-Because first it does: mv "Miguel Bos�" "Miguel Bosé"&lt;br /&gt;Second:  mv "Miguel Bos�/CD 1" "Miguel Bosé/CD 1", which must fail because we have just renamed the parent directory, so it actually should be: mv "Miguel Bosé/CD 1" "Miguel Bosé/CD 1" (which is by the way a no-op, and should be filtered out - but I'll come to that in a minute...)&lt;br /&gt;&lt;br /&gt;Obviously the problem in the foregoing example was that the conversion of the parent directory wasn't conveyed to its children (inner files/directories). That's because we piped the output of the find command (which was executed once at the beginning) to the loop.&lt;br /&gt;So one remedy would be to re-execute the find every time we rename a directory - obviously we would have to keep track of our current position in the directory hierarchy, otherwise we would have created an endless loop - not very desirable. 
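&lt;br /&gt;&lt;br /&gt;The stale-path failure described above can be reproduced on a throwaway tree. This is only a sketch: the names old, new and sub are made up, and plain ASCII names stand in for the mis-encoded ones:&lt;br /&gt;
&lt;pre&gt;
```shell
# Throwaway demo (hypothetical names): renaming a parent directory
# invalidates the child paths that find has already listed.
tmp=$(mktemp -d)
mkdir -p "$tmp/old/sub"
cd "$tmp" || exit 1
list=$(find old)        # snapshot taken once: "old", then "old/sub"
failed=0
for p in $list; do
    new=$(printf '%s' "$p" | sed 's/old/new/')
    # first mv renames "old" to "new"; second mv fails, "old/sub" is gone
    mv "$p" "$new" || failed=$((failed + 1))
done
echo "failed renames: $failed"
```
&lt;/pre&gt;
The second mv fails because "old/sub" was listed before "old" was renamed.&lt;br /&gt;&lt;br /&gt;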
If this sounds complicated to you, you're right - there's actually a much simpler solution.&lt;br /&gt;&lt;br /&gt;Some handy tools:&lt;br /&gt;dirname &amp;lt;filepath&amp;gt; ... returns the directory of the given filepath&lt;br /&gt;e.g.: &lt;br /&gt;$ dirname "Miguel Bosé/CD 1" &lt;br /&gt;Miguel Bosé&lt;br /&gt;&lt;br /&gt;basename &amp;lt;filepath&amp;gt; ... returns the basename of the given filepath&lt;br /&gt;e.g.:&lt;br /&gt;$ basename "Miguel Bosé/CD 1" &lt;br /&gt;CD 1&lt;br /&gt;&lt;br /&gt;hexdump &amp;lt;file&amp;gt; ... returns a formatted hexadecimal representation of the file contents.&lt;br /&gt;A badly documented feature is the -e &amp;lt;format&amp;gt;  option, which lets you define the format of the output:  hexdump -e ' [iterations]/[byte_count] "[format string]" '&lt;br /&gt;&lt;br /&gt;Let's say we want to convert a string into its hexcode using the format \x??&lt;br /&gt;e.g. "Hello" to "\x68\x65\x6c\x6c\x6f" (one-byte hex representation)&lt;br /&gt;The hexdump command would be:&lt;br /&gt;hexdump -v -e '1/1 "\\\x"' -e '1/1 "%01x"' &lt;br /&gt;&lt;br /&gt;&lt;br /&gt;So all we have to do is rename the directories in a topological order (parents before children) - Thankfully find does exactly that.&lt;br /&gt;&lt;br /&gt;So our next attempt would be something like this:&lt;br /&gt;&lt;br /&gt;data=$(find Miguel*) #Fetch filepath list to be processed&lt;br /&gt;&lt;br /&gt;while [ "$data" != "" ]; do&lt;br /&gt;f=$(echo -n "$data" | head -n 1 | tr -d '\n'); #extract first filepath from the list&lt;br /&gt;dir=$(dirname "$f" | tr -d '\n'); #extract dirname&lt;br /&gt;dirc=$(echo -n "$dir" | iconv -f "iso8859-1"); #dirc .. 
converted dirname&lt;br /&gt;to=$(echo -n "$f" | iconv -f "iso8859-1"); #converted filepath name&lt;br /&gt;from=$(echo -n "$f" | sed -e "s|$dir|$dirc|"); #partly converted filepath name&lt;br /&gt;mv "$from" "$to";&lt;br /&gt;data=$(echo -n "$data" | sed -e '1 d'); #Remove the currently processed filepath from the list&lt;br /&gt;done&lt;br /&gt;&lt;br /&gt;Unfortunately it will also crash on the second renaming as before, but for a different reason ;)&lt;br /&gt;The reason is this limited/buggy sed command:&lt;br /&gt;s/Miguel Bos�/Miguel Bosé/ somehow fails to match the unrecognized "�" (0xe9 ... é in iso8859-1) in "Miguel Bos�".&lt;br /&gt;&lt;br /&gt;So the next attempt was to convert "Miguel Bos�" into its hex representation to bypass the above checking -  Fiddlesticks!&lt;br /&gt;Sed is even "intelligent" enough to try to interpret the regex meaning of the characters.&lt;br /&gt;So if you write a dot "." in hex as "\x2e" and run "sed -e 's/\x2e/A/'", it will match any character, not just the dot, and replace it with "A" !!!&lt;br /&gt;&lt;br /&gt;Fortunately perl puts things right: perl -pe "s|&amp;lt;pattern&amp;gt;|&amp;lt;replace&amp;gt;|" is able to make a binary match with a hex representation as a pattern:&lt;br /&gt;$ echo "This is . 
hack" | perl -pe "s|\x2e|a|"&lt;br /&gt;This is a hack&lt;br /&gt;&lt;br /&gt;The revised script is:&lt;br /&gt;&lt;br /&gt;data=$(find Miguel*) #Fetch filepath list to be processed&lt;br /&gt;enc="iso8859-1"&lt;br /&gt;&lt;br /&gt;while [ "$data" != "" ]; do&lt;br /&gt;f=$(echo -n "$data" | head -n 1 | tr -d '\n'); #extract first filepath from the list&lt;br /&gt;#extract dirname and convert to hex representation&lt;br /&gt;dir=$(dirname "$f" | tr -d '\n' |  hexdump -v -e '1/1 "\\\x"' -e '1/1 "%01x"');&lt;br /&gt;#extract dirname and convert to utf-8&lt;br /&gt;dirc=$(dirname "$f" | tr -d '\n' | iconv -f "$enc");&lt;br /&gt;#converted filepath name&lt;br /&gt;to=$(echo -n "$f" | iconv -f "$enc");&lt;br /&gt;#from contains replaced dirname-part&lt;br /&gt;from=$(echo -n "$f" | perl -pe "s|$dir|$dirc|");&lt;br /&gt;if [ "$from" != "$to" ]; then #Only try to convert if filepath has changed&lt;br /&gt;  #Convert&lt;br /&gt;  mv "$from" "$to";&lt;br /&gt;fi&lt;br /&gt;data=$(echo -n "$data" | sed -e '1 d'); #Remove the currently processed filepath from the list&lt;br /&gt;done&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9084486178517287444-369850831529544476?l=palehui.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://palehui.blogspot.com/feeds/369850831529544476/comments/default' title='Kommentare zum Post'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=9084486178517287444&amp;postID=369850831529544476' title='0 Kommentare'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9084486178517287444/posts/default/369850831529544476'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9084486178517287444/posts/default/369850831529544476'/><link rel='alternate' type='text/html' href='http://palehui.blogspot.com/2009/07/deep-file-encoding-converter.html' 
title='Deep file encoding converter'/><author><name>picodeoro</name><uri>http://www.blogger.com/profile/06403044534403570100</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9084486178517287444.post-496037739871957944</id><published>2009-04-29T13:04:00.000-07:00</published><updated>2009-06-16T17:05:08.146-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='linux'/><category scheme='http://www.blogger.com/atom/ns#' term='enconding'/><category scheme='http://www.blogger.com/atom/ns#' term='iconv'/><category scheme='http://www.blogger.com/atom/ns#' term='unicode'/><category scheme='http://www.blogger.com/atom/ns#' term='converting'/><category scheme='http://www.blogger.com/atom/ns#' term='cp1251'/><title type='text'>iconv</title><content type='html'>iconv is a Linux utility which converts data from one encoding to another. Under Linux textual data is usually stored in a Unicode encoding (UTF-8), because it supports all character sets. So for textual data to be displayed correctly in most programs it has to be converted - if it isn't already - from the source encoding (e.g. cp1251 or cp866, the most popular Cyrillic encodings) to Unicode - here &lt;span style="font-style:italic;"&gt;iconv&lt;/span&gt; comes in handy.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style:italic;"&gt;Invoke options:&lt;/span&gt;&lt;br /&gt;-l ... Lists all available encodings (also aliases, like latin1, cyrillic,...)&lt;br /&gt;-f ... the encoding to convert from&lt;br /&gt;-t ... 
the encoding to convert to (default: the encoding of the current locale)&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;h2&gt;Getting started&lt;/h2&gt;&lt;br /&gt;&lt;br /&gt;Often the id3-tags of mp3-files are encoded in some strange encoding.&lt;br /&gt;Let's take a Russian mp3-file "01-Posledny_Zakat.mp3", whose id3-tag I know is encoded in cp1251 (as mentioned above, a Cyrillic encoding).&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style:italic;"&gt;"But what if I don't know what encoding it is?"&lt;/span&gt; you might ask&lt;br /&gt;-Be patient, we will tackle that in the next chapter.&lt;br /&gt;&lt;br /&gt;id3 -Rl 01-Posledny_Zakat.mp3 &lt;br /&gt;Filename: 01-Posledny_Zakat.mp3&lt;br /&gt;Title: ��������� �����&lt;br /&gt;Artist: ����&lt;br /&gt;Album: ����������&lt;br /&gt;Year: 2006&lt;br /&gt;Genre: Heavy Metal (137)&lt;br /&gt;&lt;br /&gt;Converting it to unicode:&lt;br /&gt;&lt;br /&gt;id3 -R -l 01-Posledny_Zakat.mp3 | grep -i "title\|artist\|album" | iconv -f cp1251 -t utf-8 | cut -d ' ' -f 2- | (read title &amp;&amp; read artist &amp;&amp; read album &amp;&amp; echo "id3 -t '$title' -a '$artist' -A '$album'")&lt;br /&gt;&lt;br /&gt;(If you have Cyrillic fonts installed you should now be able to see some cool characters)&lt;br /&gt;&lt;br /&gt;id3 -Rl 01-Posledny_Zakat.mp3 &lt;br /&gt;Filename: 01-Posledny_Zakat.mp3&lt;br /&gt;Title: Последний закат&lt;br /&gt;Artist: Ария&lt;br /&gt;Album: Армагеддон&lt;br /&gt;Year: 2006&lt;br /&gt;Genre: Heavy Metal (137)&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Command for batch converting all mp3-files within a directory:&lt;br /&gt;&lt;br /&gt;id3 -R -l *.mp3 | grep -i "filename\|title\|artist\|album" | iconv -f cp1251 -t utf-8 | cut -d ' ' -f 2- | (while read filename; do read title &amp;&amp; read artist &amp;&amp; read album &amp;&amp; id3 -t "$title" -a "$artist" -A "$album" "$filename"; done)&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;h2&gt;Determine unknown encoding&lt;/h2&gt;&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;A) Requires you to be able 
to recognize correctly encoded data (i.e. you should be able to read the target language)&lt;/h3&gt;&lt;br /&gt;If you don't know which encoding your data is in, you can try a brute-force method, testing every (reasonable) encoding.&lt;br /&gt;Let's say we want to try all encodings starting with CP\d (where \d is a digit). These include, by the way, the common Cyrillic code pages - so we've assumed our data is encoded in some Cyrillic encoding.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;iconv -l | grep -i "^cp[0-9]" | sed -e 's|//||' | while read i; do  in=$(find ./ -maxdepth 1 -type d -ctime -1); str=$(echo "$in" | iconv -f "$i" -t utf-8 2&gt;/dev/null);  if [ "$?" -ne "0" ];then continue; fi; echo "$str";  echo "Encoding: '$i'"; sleep 1; done&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;B) Requires you to know exactly which character should be displayed instead of the wrongly decoded one&lt;/h3&gt;&lt;br /&gt;You've got an mp3-file displayed as:&lt;br /&gt;&lt;br /&gt;Medina Azahara - Caravana Espa�ola - 8 - Caravana Espa�ola.mp3&lt;br /&gt;&lt;br /&gt;You know that it should be:&lt;br /&gt;&lt;br /&gt;Medina Azahara - Caravana Española - 8 - Caravana Española.mp3&lt;br /&gt;&lt;br /&gt;Let's dump the raw bytes of the filename: ls | hexdump -C&lt;br /&gt;-C .. 
Tells hexdump to output hex data and the ASCII representation side by side&lt;br /&gt;&lt;br /&gt;An excerpt of the output looks like this:&lt;br /&gt;...&lt;br /&gt;000002d0  6f 6d 29 20 2d 20 43 61  72 61 76 61 6e 61 20 45  |om) - Caravana E|&lt;br /&gt;000002e0  73 70 61 &lt;span style="color:red"&gt;a4&lt;/span&gt; 6f 6c 61 20  2d 20 38 20 2d 20 43 61  |spa&lt;span style="color:red"&gt;.&lt;/span&gt;ola - 8 - Ca|&lt;br /&gt;...&lt;br /&gt;&lt;br /&gt;So we see that 0xa4 (hexadecimal) should be decoded as ñ.&lt;br /&gt;&lt;span style="font-style:italic;"&gt;(The assumption is of course that it is an 8-bit encoding.)&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Now we can use google with the search string "0xa4 ñ".&lt;br /&gt;With a little luck you get a site naming the corresponding encoding.&lt;br /&gt;(In this case it was cp850 "DOS latin1")&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;Comparison to Method A:&lt;/span&gt;&lt;br /&gt;Advantage: Method A can take quite a lot of time if you don't know which encodings to&lt;br /&gt;restrict your search to.&lt;br /&gt;Disadvantage: No guarantee of success.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9084486178517287444-496037739871957944?l=palehui.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://palehui.blogspot.com/feeds/496037739871957944/comments/default' title='Kommentare zum Post'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=9084486178517287444&amp;postID=496037739871957944' title='1 Kommentare'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9084486178517287444/posts/default/496037739871957944'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9084486178517287444/posts/default/496037739871957944'/><link rel='alternate' type='text/html' 
href='http://palehui.blogspot.com/2009/04/iconv.html' title='iconv'/><author><name>picodeoro</name><uri>http://www.blogger.com/profile/06403044534403570100</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9084486178517287444.post-3854686726045455004</id><published>2009-04-29T13:00:00.000-07:00</published><updated>2009-04-29T13:32:16.626-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='library_path'/><category scheme='http://www.blogger.com/atom/ns#' term='pkg-config'/><category scheme='http://www.blogger.com/atom/ns#' term='configure'/><category scheme='http://www.blogger.com/atom/ns#' term='linux'/><category scheme='http://www.blogger.com/atom/ns#' term='debian'/><category scheme='http://www.blogger.com/atom/ns#' term='library'/><title type='text'>Library Quirks (on debian)</title><content type='html'>&lt;span style="font-weight:bold;"&gt;Installing libraries on exotic paths&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Because Debian is very conservative regarding its package management, new packages are often not to be found in the repository.&lt;br /&gt;Chances are also low that you'll find precompiled deb packages on the web.&lt;br /&gt;&lt;br /&gt;So you have to compile the packages on your own.&lt;br /&gt;&lt;br /&gt;The following are some issues regarding the configure step when compiling from source.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;1) Required libraries aren't found even though you've got them installed (using aptitude, apt-get, ...)&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Solution: The configure script looks for the header files of the libraries (usually located in /usr/include, /usr/local/include). If you install a library with aptitude, apt-get, .. 
you only install the shared-object files *.so (usually located in /usr/lib, /usr/local/lib), because only these files are needed for execution by other programs (these shared objects are the equivalent of *.dll files under Windows and contain the actual executable code).&lt;br /&gt;Header files, by contrast, are only needed for compiling programs which use those libraries, because they contain the structure declarations, data types and function prototypes of the library.&lt;br /&gt;In order to get them for your already installed package, you have to install the package ending in -dev (which stands for development files).&lt;br /&gt;Example:&lt;br /&gt;Say you've got package 'libglib2.0-0' installed and configure yields the error 'glib2.0 ... not found'.&lt;br /&gt;All you have to do is install 'libglib2.0-dev': 'sudo apt-get install libglib2.0-dev'&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;2) Required libraries aren't found even though you've installed them and their header-files&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;The problem is probably that you've installed the libraries into a non-standard location (specifically not /usr or /usr/local). This can be achieved by appending the argument --prefix=&amp;lt;new location&amp;gt; to configure.&lt;br /&gt;&lt;br /&gt;Say you want to install your library into /usr/local/exotic, all you've gotta do is a ./configure --prefix=/usr/local/exotic (and of course make &amp;&amp; make install).&lt;br /&gt;This is useful if you don't want to mess up your system with experimental versions of libraries, because the dynamic linker (ld.so) chooses the highest subversion of your library available on your system (actually the APIs and the behaviour of libraries shouldn't change between subversions -- reality is a bit different).&lt;br /&gt;&lt;br /&gt;Back to the problem: Look at the lines above the line 'Checking for &amp;lt;library&amp;gt; ... not found'; if there is 'Checking for pkg-config ...' 
then you've just found the problem.&lt;br /&gt;&lt;br /&gt;Here is an excerpt from 'man pkg-config':&lt;br /&gt;&lt;br /&gt;" The  pkg-config program is used to retrieve information about installed&lt;br /&gt;  libraries in the system.  It is typically  used  to  compile  and  link&lt;br /&gt;  against  one  or more libraries.  Here is a typical usage scenario in a&lt;br /&gt;  Makefile:&lt;br /&gt;&lt;br /&gt;    program: program.c&lt;br /&gt;       cc program.c `pkg-config --cflags --libs gnomeui`&lt;br /&gt;&lt;br /&gt;  pkg-config retrieves information about packages from  special  metadata&lt;br /&gt;  files. These files are named after the package, with the extension .pc.&lt;br /&gt;  By default, pkg-config looks in the directory prefix/lib/pkgconfig  for&lt;br /&gt;  these  files;  it  will  also  look in the colon-separated (on Windows,&lt;br /&gt;  semicolon-separated) list of  directories  specified  by  the&lt;br /&gt;  PKG_CONFIG_PATH environment variable."&lt;br /&gt;&lt;br /&gt;Eureka! So the message '&amp;lt;library&amp;gt; ... not found' actually just means that pkg-config didn't find a corresponding *.pc file for the library. That's because pkg-config by default searches for its *.pc files in the /usr/lib/pkgconfig and /usr/local/lib/pkgconfig directories. But our library has its *.pc files in /usr/local/exotic/lib/pkgconfig, so, as stated in the man page, &lt;br /&gt;all we have to do is tell pkg-config where to look for the *.pc files using the PKG_CONFIG_PATH variable.&lt;br /&gt;&lt;br /&gt;So all you have to do is (in bash):&lt;br /&gt;&lt;br /&gt;$ export PKG_CONFIG_PATH=/usr/local/exotic/lib/pkgconfig &lt;br /&gt;&lt;br /&gt;If configure now gets further but stops with a message like the following, there is one more thing to fix:&lt;br /&gt;&lt;br /&gt;checking for GLIB - version &gt;= 2.17.6...&lt;br /&gt;*** 'pkg-config --modversion glib-2.0' returned 2.18.0, but GLIB (2.12.4)&lt;br /&gt;*** was found! If pkg-config was correct, then it is best&lt;br /&gt;*** to remove the old version of GLib. 
You may also be able to fix the error&lt;br /&gt;*** by modifying your LD_LIBRARY_PATH enviroment variable, or by editing&lt;br /&gt;*** /etc/ld.so.conf. Make sure you have run ldconfig if that is&lt;br /&gt;*** required on your system.&lt;br /&gt;*** If pkg-config was wrong, set the environment variable PKG_CONFIG_PATH&lt;br /&gt;*** to point to the correct configuration files&lt;br /&gt;no&lt;br /&gt;configure: error:&lt;br /&gt;*** GLIB 2.17.6 or better is required. The latest version of&lt;br /&gt;*** GLIB is always available from ftp://ftp.gtk.org/pub/gtk/.&lt;br /&gt;&lt;br /&gt;This means the header files were found correctly, but the library itself wasn't (the old one in the standard location /usr/lib was picked up instead). The remedy is to state explicitly where to search for the libraries through the LD_RUN_PATH variable:&lt;br /&gt;&lt;br /&gt;export LD_RUN_PATH=/usr/local/exotic/lib&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9084486178517287444-3854686726045455004?l=palehui.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://palehui.blogspot.com/feeds/3854686726045455004/comments/default' title='Kommentare zum Post'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=9084486178517287444&amp;postID=3854686726045455004' title='0 Kommentare'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9084486178517287444/posts/default/3854686726045455004'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9084486178517287444/posts/default/3854686726045455004'/><link rel='alternate' type='text/html' href='http://palehui.blogspot.com/2009/04/library-quirks-on-debian.html' title='Library Quirks (on debian)'/><author><name>picodeoro</name><uri>http://www.blogger.com/profile/06403044534403570100</uri><email>noreply@blogger.com</email><gd:image 
rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9084486178517287444.post-6672401558366258314</id><published>2008-07-01T13:50:00.000-07:00</published><updated>2008-07-01T17:34:20.077-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='deixto'/><category scheme='http://www.blogger.com/atom/ns#' term='www::mechanize'/><category scheme='http://www.blogger.com/atom/ns#' term='webscraper'/><category scheme='http://www.blogger.com/atom/ns#' term='web data extractor'/><title type='text'>Website extraction tools</title><content type='html'>I'm evaluating some website extraction tools with command-line support in order to use them from scripts:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://deixto.csd.auth.gr/"&gt;DEiXTo&lt;/a&gt; from the Computer Science Department of the Aristotle University of Thessaloniki is a GPL-licensed, yet very powerful, web data extractor.&lt;br /&gt;&lt;br /&gt;It consists of two parts. The first is the GUI-based, Windows-only (quite a drawback) query generator, which produces an XML file (*.wpf, called a wrapper project file) that describes what should be matched.&lt;br /&gt;The GUI has a built-in web browser for selecting the visible elements of interest. Furthermore it supports regex, neighborhood and a lot more...&lt;br /&gt;It's still beta and has some teething troubles. In some cases I suddenly end up with 2 "virtual roots", one of which I can't remove.&lt;br /&gt;&lt;br /&gt;The second is the command-line data extractor, which is fed the WPF file generated with the GUI. The extractor is under GPL, written in Perl, available for Windows and Linux and runs without installation. 
Supported output formats are: tab-delimited, XML, RSS, CSV and Excel.&lt;br /&gt;Under the hood it's mostly based on the XML::LibXML, WWW::Mechanize and Tree::Fast Perl modules.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9084486178517287444-6672401558366258314?l=palehui.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://palehui.blogspot.com/feeds/6672401558366258314/comments/default' title='Kommentare zum Post'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=9084486178517287444&amp;postID=6672401558366258314' title='1 Kommentare'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9084486178517287444/posts/default/6672401558366258314'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9084486178517287444/posts/default/6672401558366258314'/><link rel='alternate' type='text/html' href='http://palehui.blogspot.com/2008/07/website-extraction-tools.html' title='Website extraction tools'/><author><name>picodeoro</name><uri>http://www.blogger.com/profile/06403044534403570100</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9084486178517287444.post-8335490152005948102</id><published>2008-06-29T14:34:00.000-07:00</published><updated>2008-07-01T18:01:44.787-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='batch renaming'/><category scheme='http://www.blogger.com/atom/ns#' term='regular expressions'/><category scheme='http://www.blogger.com/atom/ns#' term='sed'/><title type='text'>Batch renaming of files using regular expressions</title><content type='html'>&lt;div style="text-align: center;"&gt;&lt;span style="font-size:130%;"&gt;&lt;span 
style="font-size:130%;"&gt;1. Batch renaming of files&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;span style="font-weight: bold;font-size:130%;" &gt;Problem:&lt;/span&gt; You've got a lot of mp3-files whose names contain a nasty string like "(www.nastysite.com)" and want to remove it from the name:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;font-size:130%;" &gt;Code:&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;find ./ -iname "*.mp3" |&lt;br /&gt;(&lt;br /&gt;# IFS= and -r keep leading whitespace and backslashes in the names intact&lt;br /&gt;while IFS= read -r i; do&lt;br /&gt;   m=$(&lt;br /&gt;        echo "$i" | sed -e 's/(www\..*\.com) //'&lt;br /&gt;      );&lt;br /&gt;&lt;br /&gt;   # rename only if the name actually changed&lt;br /&gt;   [ "$i" = "$m" ] || mv "$i" "$m";&lt;br /&gt;done&lt;br /&gt;)&lt;br /&gt;&lt;br /&gt;or copy-paste version:&lt;br /&gt;&lt;br /&gt;find ./ -iname "*.mp3" | (while IFS= read -r i; do m=$(echo "$i" | sed -e 's/(www\..*\.com) //' ); [ "$i" = "$m" ] || mv "$i" "$m";  done)&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:130%;"&gt;&lt;span style="font-weight: bold;"&gt;Breakdown&lt;br /&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;1. find ./ -iname "*.mp3"  ... writes the path of every mp3 below the current directory to stdout (needed as starting point)&lt;br /&gt;&lt;br /&gt;2. The mp3 list is piped to a little bash script, which loops over every line of input&lt;br /&gt;&lt;br /&gt;First the name is read into variable 'i'.&lt;br /&gt;&lt;br /&gt;Second we pipe the name (through echo "$i") to sed, which removes the nasty string using a regex pattern&lt;br /&gt;'s/(www\..*\.com) //' ... replaces the first occurrence of '(www.nastysite.com) ' with nothing. In sed's basic regular expressions unescaped parentheses match literally.&lt;br /&gt;&lt;br /&gt;Then we store the result in variable 'm'.&lt;br /&gt;&lt;br /&gt;Last we rename the file with 'mv "$i" "$m"', unless the name is unchanged.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;div style="text-align: center;"&gt;&lt;span style="font-size:180%;"&gt;2. 
Batch renaming of id3-tags of mp3-files&lt;/span&gt;&lt;br /&gt;&lt;/div&gt;&lt;br /&gt;&lt;span style="font-weight: bold;font-size:130%;" &gt;Problem:&lt;/span&gt; You want your mp3s to be indexed by a program that relies on id3-tags. Sadly many mp3s have messed up id3-tags or no tags at all.&lt;br /&gt;      But we want at least the title to be displayed correctly, so we need to set it.&lt;br /&gt;      For that we derive it from the filename:&lt;br /&gt;       Say our mp3-files are named like this: 'Triana - El Patio - 3 - Abre la puerta niña.mp3'.&lt;br /&gt;       The name consists of 4 parts:&lt;br /&gt;        1. Artist: Triana&lt;br /&gt;        2.  Album: El Patio&lt;br /&gt;        3. Track#: 3&lt;br /&gt;        4.  Title: Abre la puerta niña&lt;br /&gt;      The title tag should then be '3 - Abre la puerta niña'&lt;br /&gt;&lt;br /&gt;This little script does the trick. It's just a slight variation of the previous one, namely replacing the renaming part (mv "$i" "$m")&lt;br /&gt;with the retagging part (id3 -t "$m" "$i").&lt;br /&gt;For that we use the command-line tool id3, which edits the id3-tag info of an mp3-file. 
The option '-t' sets the title, '-a' the artist, '-A' the album, ...&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;font-size:130%;" &gt;Code:&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="codebox"&gt;&lt;br /&gt;&lt;pre&gt;&lt;tt&gt;find &lt;font color="#990000"&gt;./&lt;/font&gt; -iname &lt;font color="#FF0000"&gt;"*.mp3"&lt;/font&gt; &lt;font color="#990000"&gt;|&lt;/font&gt;&lt;br /&gt;&lt;font color="#990000"&gt;(&lt;/font&gt;&lt;br /&gt;    &lt;b&gt;&lt;font color="#0000FF"&gt;while&lt;/font&gt;&lt;/b&gt; IFS= &lt;b&gt;&lt;font color="#0000FF"&gt;read&lt;/font&gt;&lt;/b&gt; -r i&lt;font color="#990000"&gt;;&lt;/font&gt; &lt;b&gt;&lt;font color="#0000FF"&gt;do&lt;/font&gt;&lt;/b&gt;&lt;br /&gt;        &lt;font color="#009900"&gt;m&lt;/font&gt;&lt;font color="#990000"&gt;=&lt;/font&gt;$&lt;font color="#990000"&gt;(&lt;/font&gt;&lt;br /&gt;              echo &lt;font color="#FF0000"&gt;"$i"&lt;/font&gt; &lt;font color="#990000"&gt;|&lt;/font&gt; sed -e &lt;font color="#FF0000"&gt;'s/.* - &lt;/font&gt;&lt;font color="#CC33CC"&gt;\(&lt;/font&gt;&lt;font color="#FF0000"&gt;[0-9][0-9]* - .*&lt;/font&gt;&lt;font color="#CC33CC"&gt;\)\.&lt;/font&gt;&lt;font color="#FF0000"&gt;mp3$/&lt;/font&gt;&lt;font color="#CC33CC"&gt;\1&lt;/font&gt;&lt;font color="#FF0000"&gt;/'&lt;/font&gt;&lt;br /&gt;          &lt;font color="#990000"&gt;);&lt;/font&gt;&lt;br /&gt;        &lt;i&gt;&lt;font color="#9A1900"&gt;#Output status&lt;/font&gt;&lt;/i&gt;&lt;br /&gt;        echo &lt;font color="#FF0000"&gt;"$i: '$m'"&lt;/font&gt;&lt;font color="#990000"&gt;;&lt;/font&gt;&lt;br /&gt;        &lt;i&gt;&lt;font color="#9A1900"&gt;#Retag the title&lt;/font&gt;&lt;/i&gt;&lt;br /&gt;        id3 -t &lt;font color="#FF0000"&gt;"$m"&lt;/font&gt; &lt;font color="#FF0000"&gt;"$i"&lt;/font&gt;&lt;font color="#990000"&gt;;&lt;/font&gt;&lt;br /&gt;    &lt;b&gt;&lt;font color="#0000FF"&gt;done&lt;/font&gt;&lt;/b&gt;&lt;br /&gt;&lt;font color="#990000"&gt;)&lt;/font&gt;&lt;br /&gt;&lt;/tt&gt;&lt;/pre&gt;&lt;br /&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;or copy-paste version:&lt;br /&gt;&lt;br /&gt;find ./ -iname "*.mp3" | (while IFS= read -r i; do m=$(echo "$i" | sed -e 's/.* - \([0-9][0-9]* - .*\)\.mp3$/\1/'); echo "$i: '$m'"; id3 -t "$m" "$i"; done)&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:130%;"&gt;&lt;span style="font-weight: bold;"&gt;Breakdown&lt;br /&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;The regex is a bit more tricky, because it uses grouping:&lt;br /&gt;&lt;br /&gt;s/.* - \([0-9][0-9]* - .*\)\.mp3$/\1/&lt;br /&gt;&lt;br /&gt;.* is greedy, which means it consumes as much as it can, as long as the whole expression still matches. Here the leading .* swallows the path, the artist and the album.&lt;br /&gt;&lt;br /&gt;\(...\) doesn't match anything by itself; it's just a marker (group), so that we can reference its content (everything within the brackets) by \1 later.&lt;br /&gt;&lt;br /&gt;[0-9] ... this is a character class which matches a single digit from 0 through 9; [0-9][0-9]* matches one or more digits, so a track number like 12 stays intact&lt;br /&gt;$ anchors the match at the end of the name&lt;br /&gt;\. matches a literal dot (because . 
by default matches any single character)&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9084486178517287444-8335490152005948102?l=palehui.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://palehui.blogspot.com/feeds/8335490152005948102/comments/default' title='Kommentare zum Post'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=9084486178517287444&amp;postID=8335490152005948102' title='0 Kommentare'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9084486178517287444/posts/default/8335490152005948102'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9084486178517287444/posts/default/8335490152005948102'/><link rel='alternate' type='text/html' href='http://palehui.blogspot.com/2008/06/batch-renaming-of-files-using-regular.html' title='Batch renaming of files using regular expressions'/><author><name>picodeoro</name><uri>http://www.blogger.com/profile/06403044534403570100</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9084486178517287444.post-5221973132005576932</id><published>2008-06-29T13:58:00.000-07:00</published><updated>2009-04-29T13:34:22.395-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='palehui'/><category scheme='http://www.blogger.com/atom/ns#' term='nahuatl'/><category scheme='http://www.blogger.com/atom/ns#' term='Introduction'/><category scheme='http://www.blogger.com/atom/ns#' term='language'/><title type='text'>What does the blogname mean?</title><content type='html'>'palehui' is a &lt;a href="http://en.wikipedia.org/wiki/Nahuatl"&gt;Nahuatl&lt;/a&gt; verb meaning 'to help, to assist', which describes quite 
well what this blog is aimed at: helping me archive minor things related to Linux, my studies (physics, maths) and languages.&lt;br /&gt;...and most of all it's catchy and easy to remember.&lt;br /&gt;&lt;br /&gt;And no, I don't speak Nahuatl, but I'm interested in languages and one day came across the site &lt;a href="http://mexica.ohui.net/glosarios/2/"&gt;http://mexica.ohui.net&lt;/a&gt;, where I took the word from.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9084486178517287444-5221973132005576932?l=palehui.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://palehui.blogspot.com/feeds/5221973132005576932/comments/default' title='Kommentare zum Post'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=9084486178517287444&amp;postID=5221973132005576932' title='0 Kommentare'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9084486178517287444/posts/default/5221973132005576932'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9084486178517287444/posts/default/5221973132005576932'/><link rel='alternate' type='text/html' href='http://palehui.blogspot.com/2008/06/what-does-blogname-mean.html' title='What does the blogname mean?'/><author><name>picodeoro</name><uri>http://www.blogger.com/profile/06403044534403570100</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry></feed>
