<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Allen J. Hall &#187; Linux</title>
	<atom:link href="http://www.allenjhall.com/content/tag/linux/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.allenjhall.com/content</link>
	<description>Materials Science &#38; Engineering, Productivity, and Life</description>
	<lastBuildDate>Fri, 27 Jan 2012 03:55:25 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
		<item>
		<title>Software For Scientists: Awk (&amp; OsX)</title>
		<link>http://www.allenjhall.com/content/2010/10/13/software-for-scientists-awk-osx/</link>
		<comments>http://www.allenjhall.com/content/2010/10/13/software-for-scientists-awk-osx/#comments</comments>
		<pubDate>Wed, 13 Oct 2010 17:45:26 +0000</pubDate>
		<dc:creator>Allen</dc:creator>
				<category><![CDATA[Materials Science and Engineering]]></category>
		<category><![CDATA[OpenSource]]></category>
		<category><![CDATA[OsX]]></category>
		<category><![CDATA[Research Work]]></category>
		<category><![CDATA[Software]]></category>
		<category><![CDATA[Work]]></category>
		<category><![CDATA[Data]]></category>
		<category><![CDATA[How To]]></category>
		<category><![CDATA[Linux]]></category>
		<category><![CDATA[Mac OsX]]></category>
		<category><![CDATA[Tip]]></category>

		<guid isPermaLink="false">http://www.allenjhall.com/content/?p=364</guid>
		<description><![CDATA[For years while Apple had proprietary system software, I was itching to get a Unix system underneath and have the ease of the windowing system. Well, after OsX was released, I was ecstatic. Why? Because of the ease of some tasks in Unix in comparison to other OS&#8217;s. This post is only one example of [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.allenjhall.com/content/wp-content/TerminalApp1.png"><img src="http://www.allenjhall.com/content/wp-content/TerminalApp1.png" alt="" title="TerminalApp" width="156" height="176" class="alignleft size-full wp-image-401" /></a>For years while Apple had proprietary system software, I was itching to get a Unix system underneath and have the ease of the windowing system.  Well, after OsX was released, I was ecstatic.  Why?  Because of the ease of some tasks in Unix in comparison to other OS&#8217;s.  This post is only one example of what you can do if you do a bit of research into how to use your Mac.  For those who have un*x boxes, this will merely be a place-holder for a few AWK scripts for you.<br />
<span id="more-364"></span></p>
<p>One of the first research programs I worked on as an undergraduate involved taking large amounts of data from an HPGC running a fixed-bed reactor.  There were many problems with the work: (1) since our filter materials were very good at filtering the chemicals used, the run-times to breakthrough were very long (which meant that I had to come into the lab every 4 hours and restart the machine), (2) even with the lowest data setting, we had gobs and gobs of data (gigabytes in a day, when we were used to kilobytes of data), and (3) the data was saved in the form of: A1 B2 C3 (return) D4 E5 F6 (return).  To the rescue was a friend of mine, Nicolas, who was an excellent coder.  He turned me on to AWK and it worked beautifully.  The code he came up with is more complex than the following, and I&#8217;ve misplaced the old code, so for now, let&#8217;s deal with a simple case&#8230;</p>
<p>Let&#8217;s say you have too much data (you set the machine wrong, or you can&#8217;t set the machine properly), and you don&#8217;t care about throwing away the data, as the trends you want to see aren&#8217;t on the order of the data you wish to pitch.  If you have thousands of data-points and you only need say every 5th line, or every other line- do you really put it in Excel and edit it manually?  If so, you shouldn&#8217;t get a pay-raise next quarter- you can save crap tons of time by using the right tools.</p>
<p>Enter <a href="http://www.freebsd.org/cgi/man.cgi?query=awk&#038;apropos=0&#038;sektion=0&#038;manpath=FreeBSD+8.1-RELEASE&#038;format=html">Awk</a> &#8211; a great command-line program available on almost every linux/unix computer (maybe yours is called Gawk).  [A huge book on how to use all the aspects of awk is available here: <a href="http://www.gnu.org/manual/gawk/gawk.html">The Gnu Awk User's Guide</a>.]</p>
<p>With a single one-liner of code in the terminal, you can accomplish your goal of reducing your data.  My pal Brandon wanted to <strong>keep only every 5th line</strong> from his data:</p>
<p><code>cat "$@" | awk -F, '{if (count++%5==0) print $0;}' >~/Desktop/AwkOutput5thLines.txt</code></p>
<p>This is the code I used within Automator to accomplish his data-throw-away needs.  (There was an error with line-endings that I fix later on in this post&#8230;)  cat &#8220;$@&#8221; takes the files handed over by the choose-file Automator task (as arguments) and pipes their contents to the awk command.  The -F, option sets awk&#8217;s field separator to a comma; it isn&#8217;t strictly needed here, since the script only prints whole lines.  The count++ expression is doing the dirty work: count starts at zero and goes up by one for every line read, so the test is true for lines 1, 6, 11, and so on.  Hat-tip to <a href="http://duanesbrain.blogspot.com/2006/03/unix-cat-every-nth-third-fourth-fifth.html">Duane&#8217;s Brain blog</a> for the awk portion of the code which does the dirty work here!  Details of how to pass strings as arguments came from <a href="http://hints.macworld.com/article.php?story=20070331120036768&#038;query=finder">this great post on MacOsXHints.com</a>.  </p>
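<p>If you need a step other than 5, the same idea can be wrapped in a small shell function.  This is only a sketch: the function name keep_every_nth and its argument order are made up for illustration, and it uses awk&#8217;s built-in line counter NR in place of the count variable.</p>

```shell
# keep_every_nth N [FILE...]
# Hypothetical helper (not from the original Automator workflow).
# Prints lines 1, N+1, 2N+1, ... of the input, which matches the
# count++ % 5 == 0 test above when N is 5.
keep_every_nth() {
  step="$1"
  shift
  # With no FILE arguments, cat reads standard input instead.
  cat "$@" | awk -v n="$step" '(NR - 1) % n == 0'
}
```

<p>Inside Automator this could be called as <code>keep_every_nth 5 "$@" >~/Desktop/AwkOutput5thLines.txt</code>.  Passing 2 instead of 5 keeps lines 1, 3, 5, and so on, which is the same as throwing away every other line.</p>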
<p>Another frequent problem is throwing away every other line.  Here&#8217;s the code to pitch every 2nd line:<br />
<code>cat "$@" | awk -F, 'NR % 2 == 1' > ~/Desktop/AwkOutput.txt</code></p>
<p>One of the errors my pal had with his data was the line-endings.  So, note that awk expects linefeeds (unix format text) at the end of each line in order to accomplish its goals.  A simple fix is to use <a href="http://www.freebsd.org/cgi/man.cgi?query=tr&#038;apropos=0&#038;sektion=0&#038;manpath=FreeBSD+8.1-RELEASE&#038;format=html">tr</a> (translate characters) to turn the line-endings into something unix understands.</p>
<p>So, I solved the text line-ending problems by adding the following code:<br />
<code>tr '\r' '\n' </code><br />
making the final code appear like this (within automator) (all in one line):<br />
<code>cat "$@" | tr '\r' '\n' | awk -F, '{if (count++%5==0) print $0;}' >~/Desktop/AwkOutput5thLines.txt</code></p>
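<p>The tr call above suits files that end each line with a lone carriage return.  If a file instead uses Windows-style line-endings (carriage return plus linefeed), translating every carriage return would leave an empty line after each record.  Here is a minimal sketch of one way to cope with both cases, assuming your awk understands \r in regular expressions; the function name normalize_eol is made up for illustration.</p>

```shell
# normalize_eol [FILE...]
# Hypothetical helper: rewrites CR-only and CRLF line-endings as plain LF.
normalize_eol() {
  cat "$@" | awk '{
    sub(/\r$/, "")      # CRLF input: drop the CR left at the end of each record
    gsub(/\r/, "\n")    # CR-only input: the remaining CRs are the real line breaks
    print
  }'
}
```

<p>It would slot in where tr sits in the pipeline: <code>normalize_eol "$@" | awk -F, '{if (count++%5==0) print $0;}'</code>.  One caveat: a file with only carriage returns is read by awk as a single giant record, so for very large files the plain tr version is likely the gentler choice.</p>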
<p>Some more great awk links you may find useful:</p>
<ul>
<li><a href="http://www.vectorsite.net/tsawk_3.html">Awk Examples</a></li>
<li><a href="http://www.catonmat.net/blog/awk-one-liners-explained-part-three/">Famous Awk One Liners Explained</a></li>
</ul>
<p>Finally, to give you a few presents for dropping in to read this meager blog, here&#8217;s a finished Automator script as well as an application form of the script in case you need exactly every 5th line like my pal, Brandon.</p>
<ul>
<li><a href='http://www.allenjhall.com/content/wp-content/Keep-every-5th-line.zip'>Keep Every 5th Line Of My Data (ZIP file)</a></li>
<li><a href='http://www.allenjhall.com/content/wp-content/Keep-Every-5th-Line-with-encoding.app_.zip'>Keep Every 5th Line Of My Data With Encoding (ZIP file)</a></li>
<li><a href='http://www.allenjhall.com/content/wp-content/CutOutEvery2ndLine.zip'>Throw Away Every Other Line (ZIP file)</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.allenjhall.com/content/2010/10/13/software-for-scientists-awk-osx/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Software For Scientists: Engauge Digitizer</title>
		<link>http://www.allenjhall.com/content/2009/04/10/software-for-scientists-engauge-digitizer/</link>
		<comments>http://www.allenjhall.com/content/2009/04/10/software-for-scientists-engauge-digitizer/#comments</comments>
		<pubDate>Fri, 10 Apr 2009 07:03:36 +0000</pubDate>
		<dc:creator>Allen</dc:creator>
				<category><![CDATA[DataVisualization]]></category>
		<category><![CDATA[Materials Science and Engineering]]></category>
		<category><![CDATA[Matlab]]></category>
		<category><![CDATA[OpenSource]]></category>
		<category><![CDATA[Research Work]]></category>
		<category><![CDATA[Software]]></category>
		<category><![CDATA[Thesis Writing]]></category>
		<category><![CDATA[Work]]></category>
		<category><![CDATA[Cross Platform]]></category>
		<category><![CDATA[Data Mining]]></category>
		<category><![CDATA[How To]]></category>
		<category><![CDATA[Linux]]></category>
		<category><![CDATA[Mac OsX]]></category>
		<category><![CDATA[Open Source]]></category>
		<category><![CDATA[Windows]]></category>

		<guid isPermaLink="false">http://www.allenjhall.com/content/?p=104</guid>
		<description><![CDATA[In the time I&#8217;ve been doing my research work at the Univ. of IL, I&#8217;ve come across a number of graphs from various past researchers, older papers, stuck on the side of machines (calibration curves), and even hand-drawn or chart-recorder graphs in my numerous projects.  The only major problem with those graphs I&#8217;ve found is [...]]]></description>
			<content:encoded><![CDATA[<p>In the time I&#8217;ve been doing my research work at the Univ. of IL, I&#8217;ve come across a number of graphs from various past researchers, older papers, stuck on the side of machines (calibration curves), and even hand-drawn or chart-recorder graphs in my numerous projects.  The only major problem with those graphs I&#8217;ve found is that they aren&#8217;t in a digital form for further use with other data (instrument response functions) or to include in your own work as a reference.  So, what to do?</p>
<p>Well, there&#8217;s an easy solution.  It&#8217;s not the perfect solution, as it&#8217;s a bit slow (I&#8217;ll get to that in a second), but it&#8217;s a great solution to the problem, and has worked for me a number of times now.  To top it off, it&#8217;s open-source, donation-ware, and cross-platform: <a href="http://digitizer.sourceforge.net/" target="_blank">Engauge Digitizer</a> (see post at <a href="http://lifehacker.com/253720/download-of-the-day-engauge-digitizer-windowslinux">LifeHacker.com</a>).  Don&#8217;t let the website and lack of recent updates deter you.  Tools that can do what Engauge does are few and far between.  So, it is definitely worth a try.  Here&#8217;s an example of how I used it just the other day (prompting this post- I&#8217;ve used it for years now, but the recent use reminded me I should share it with others).  [click "More" to see an example use and learn more]</p>
<p><span id="more-104"></span></p>
<p>So, pretend you&#8217;re doing Cryogenic Cathodoluminescence measurements with a Gatan MonoCL system, and you want to know what the efficiency curves are, and in fact, you may want to use them to alter your data to correct for the grating response.  Only problem?  You can&#8217;t find the actual data for the grating response with wavelength anywhere&#8230; except for this one gif&#8230;</p>
<p style="text-align: center;"><img class="size-medium wp-image-109 aligncenter" title="Bentham 83 Grating" src="http://www.allenjhall.com/content/wp-content/gr083r1u2-300x223.jpg" alt="Bentham 83 Grating" width="300" height="223" /></p>
<p>Great, so that&#8217;s the grating response (this is a Bentham 83, 1.2 micron blaze).  With a bit of work in Engauge Digitizer, and some simple plotting in MATLAB, you can turn that picture into this&#8230;</p>
<p style="text-align: center;"><img class="size-medium wp-image-110 aligncenter" title="830 lines/mm 1.2micron blaze grating bentham" src="http://www.allenjhall.com/content/wp-content/830linesmm12micronblazegratingbentham-300x225.jpg" alt="830 lines/mm 1.2micron blaze grating bentham" width="300" height="225" /></p>
<p>You say big whoop- well, true, it&#8217;s not that big a deal until you need to use that curve to alter some of your data to correct for the 20% efficiency difference between say 1100 nm and 1600 nm.  [Digitizer can output your data as a comma delimited data file that you can import and use in say, MATLAB.]  Another use I&#8217;ve come across is data-mining a micrograph for positional data of certain features/points of interest.  That positional data can then be compared to other micrograph data, or can be used to find nearest-neighbor distances etc.</p>
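<p>If MATLAB isn&#8217;t handy, the comma delimited file can also be read from the command line.  The following is only a sketch under a few assumptions: the exported file has two columns (wavelength, then efficiency), the rows are sorted by increasing wavelength, and straight-line interpolation between digitized points is good enough.  The function name interp_eff and the file name used afterwards are made up for illustration.</p>

```shell
# interp_eff CURVE_CSV WAVELENGTH
# Hypothetical helper: linearly interpolates column 2 (efficiency) at
# WAVELENGTH from a comma delimited file of wavelength,efficiency rows
# sorted by increasing wavelength.  Exits non-zero if WAVELENGTH lies
# outside the digitized range.
interp_eff() {
  awk -F, -v x="$2" '
    BEGIN { x += 0 }                      # treat the requested wavelength as a number
    $1 !~ /^[0-9]/ { next }               # skip a header row or blank line
    {
      wl = $1 + 0; eff = $2 + 0
      # Interpolate once the requested wavelength sits between two rows.
      if (seen && wl > px && x >= px && wl >= x) {
        print py + (x - px) * (eff - py) / (wl - px)
        found = 1
        exit
      }
      px = wl; py = eff; seen = 1         # remember this row as the left-hand point
    }
    END { if (!found) exit 1 }
  ' "$1"
}
```

<p>For example, <code>interp_eff grating.csv 1100</code> would print the interpolated efficiency at 1100 nm.  Correcting a measured intensity would then presumably mean dividing it by that value, but check that against how your own instrument response is defined.</p>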
<p>So, give Engauge a try, and kudos go to the author for a good program that he&#8217;s put out there for free!!  The only negative thing I have to say about it is that sometimes some functions are a bit counterintuitive, and the animating cursor etc. is a bit overboard- a color change for highlighting, or a thickness variation or something like that, would be a bit more intuitive.  In any case, beggars can&#8217;t be choosers, and Engauge certainly delivers.</p>
<p style="text-align: center;"><img class="size-medium wp-image-112 aligncenter" title="Engauge Screenshot" src="http://www.allenjhall.com/content/wp-content/screenshot-300x223.jpg" alt="Engauge Screenshot" width="300" height="223" /></p>
]]></content:encoded>
			<wfw:commentRss>http://www.allenjhall.com/content/2009/04/10/software-for-scientists-engauge-digitizer/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
	</channel>
</rss>
