<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>output stream &#187; Linux</title>
	<atom:link href="http://duncanandmeg.org/blogs/code/category/linux/feed/" rel="self" type="application/rss+xml" />
	<link>http://duncanandmeg.org/blogs/code</link>
	<description>riotous events in amateur development</description>
	<lastBuildDate>Fri, 18 Feb 2011 21:28:42 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>Lazy list management</title>
		<link>http://duncanandmeg.org/blogs/code/2007/11/12/lazy-list-management/</link>
		<comments>http://duncanandmeg.org/blogs/code/2007/11/12/lazy-list-management/#comments</comments>
		<pubDate>Mon, 12 Nov 2007 21:28:09 +0000</pubDate>
		<dc:creator>dtjohnso</dc:creator>
				<category><![CDATA[Linux]]></category>

		<guid isPermaLink="false">http://duncanandmeg.org/blogs/code/2007/11/12/lazy-list-management/</guid>
		<description><![CDATA[I use a couple awk scripts to manage a list of students who come to church with me every week. When a student tells me he will be coming on Sunday, I put an asterisk (*) in front of his name in a static text file with everyone&#8217;s names. If they tell me they won&#8217;t [...]]]></description>
			<content:encoded><![CDATA[<p>I use a couple of awk scripts to manage a list of students who come to church with me every week. When a student tells me they will be coming on Sunday, I put an asterisk (*) in front of their name in a plain text file containing everyone&#8217;s names. If they tell me they won&#8217;t be coming, I put a minus sign (-) in front of their name so I remember that they said so. If I&#8217;m not sure, I put a question mark (?).</p>
<p>My main script spits out a list of all the students who are coming (i.e., have an asterisk). There&#8217;s no genius to it; I&#8217;m just logging it here for reference.</p>
<p>I also have another script that wipes all the attendance marks out of the main list file.<span id="more-10"></span></p>
<h2>Files</h2>
<p>All files are available here:</p>
<ul>
<li>
<a href="http://duncanandmeg.org/blogs/code/wp-content/uploads/2007/11/studentlist.zip" title='List management files' class="lizip">List management files</a></li>
</ul>
<p>Source is posted below without comment since it&#8217;s basically self-explanatory.</p>
<h3>StudentList.bash</h3>
<pre><code class="prettyprint">awk 'substr($1,1,1) == "*" { print substr($0,2) }' students.txt &gt; studentsComing.txt
cat studentsComing.txt</code></pre>
<h3>WipeStudentList.bash</h3>
<pre><code class="prettyprint">awk -f Wipe.awk &lt;students.txt &gt;temp
cat temp &gt;students.txt
rm temp
./StudentList.bash</code></pre>
<h3>Wipe.awk</h3>
<pre><code class="prettyprint">{if ( substr($1,1,1) == "*" || substr($1,1,1) == "-" || substr($1,1,1) == "?" )
  { print substr($0,2) }
 else
  { print $0 }
}</code></pre>
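<p>As a quick illustration, here is a minimal sketch of both operations run against a made-up roster. The names, the inline sample data, and the function names <code>roster</code>, <code>coming</code>, and <code>wipe</code> are all hypothetical, and the <code>sub()</code> call is a swapped-in, regex-based way of stripping the mark rather than the <code>substr()</code> approach in the files above.</p>

```shell
#!/bin/bash
# Hypothetical roster: "*" = coming, "-" = not coming, "?" = unsure, no mark = not asked yet.
roster() {
  printf '%s\n' '*Alice' '-Bob' '?Carol' 'Dave' '*Erin'
}

# List only the students marked with an asterisk, with the mark removed.
coming() {
  awk 'substr($0, 1, 1) == "*" { print substr($0, 2) }'
}

# Strip any leading attendance mark; unmarked lines pass through unchanged.
wipe() {
  awk '{ sub(/^[*?-]/, ""); print }'
}

roster | coming
echo "---"
roster | wipe
```

<p>With this sample roster, the first command lists Alice and Erin, and the second prints all five names with their marks removed.</p>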
]]></content:encoded>
			<wfw:commentRss>http://duncanandmeg.org/blogs/code/2007/11/12/lazy-list-management/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Processing barcode scanner data with awk and sed</title>
		<link>http://duncanandmeg.org/blogs/code/2007/08/17/processing-barcode-scanner-data-with-awk-and-sed/</link>
		<comments>http://duncanandmeg.org/blogs/code/2007/08/17/processing-barcode-scanner-data-with-awk-and-sed/#comments</comments>
		<pubDate>Fri, 17 Aug 2007 19:23:54 +0000</pubDate>
		<dc:creator>dtjohnso</dc:creator>
				<category><![CDATA[Linux]]></category>

		<guid isPermaLink="false">http://duncanandmeg.org/blogs/code/2007/08/17/processing-barcode-scanner-data-with-awk-and-sed/</guid>
		<description><![CDATA[We had an interesting project pop up here at the Library where I work yesterday. Apparently, part of our inventory process here involves downloading text files with raw barcode data from our barcode scanners, extracting the barcode from amidst the other junk data that pads it in the file, and then loading a freshly formatted [...]]]></description>
			<content:encoded><![CDATA[<p>An interesting project popped up yesterday at the Library where I work. Apparently, part of our inventory process involves downloading text files of raw barcode data from our barcode scanners, extracting each barcode from the junk data that pads it in the file, and then loading a freshly formatted list into Millennium, our library&#8217;s catalogue software.</p>
<p>I&#8217;m not typically involved with inventory or the particulars of maintaining the Millennium catalogue, but I was called in to help with writing some bash scripts to facilitate the process.<span id="more-6"></span></p>
<h2>The data</h2>
<p>Let me begin by showing some sample data from our barcode scanners. The scanners store the barcode in text files, one barcode per line, with some interesting pad characters that I don&#8217;t understand and we don&#8217;t really want for this project.</p>
<h3>Data example A</h3>
<pre><code class="prettyprint">TXT<font color="red">95053542</font>95012010:12
TXT<font color="red">95053534</font>95012010:12
TXT<font color="red">95053559</font>95012010:12
TXT<font color="red">95053567</font>95012010:12
TXT<font color="red">95053575</font>95012010:12</code></pre>
<h3>Data example B</h3>
<pre><code class="prettyprint">0000030000000000<font color="red">8016039R</font>07051001:56
0000030000000000<font color="red">8110727Q</font>07051001:56
0000030000000000<font color="red">84220078</font>07051001:56
0000030000000000<font color="red">8122772T</font>07051001:56</code></pre>
<p>In both examples here, I&#8217;ve made the actual barcode red. The rest of the line is garbage data.</p>
<p>From this data, let me make three observations.</p>
<ol>
<li>The prefix preceding the barcode varies in length from file to file: 3 characters in example A and 16 in example B. Call this length <i>n</i>.</li>
<li>Our barcodes are always 8 characters long.<br/>(I already knew this, but needed to make it clear for this post.)</li>
<li>The characters following the barcode appear to always be 11 characters long.</li>
</ol>
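<p>These observations can be checked mechanically. The sketch below is illustrative only: it feeds one line from each example above through awk and reports the implied prefix length along with the 8 characters that follow the prefix. The function name <code>describe</code> is made up for this example.</p>

```shell
#!/bin/bash
# Layout assumed from the observations above: prefix (n chars) + barcode (8) + postfix (11).
describe() {
  awk '{
    n = length($0) - 8 - 11   # whatever is left over is the prefix
    print "prefix=" n " barcode=" substr($0, n + 1, 8)
  }'
}

printf '%s\n' 'TXT9505354295012010:12' '00000300000000008016039R07051001:56' | describe
```

<p>For these two sample lines it prints <code>prefix=3 barcode=95053542</code> and <code>prefix=16 barcode=8016039R</code>.</p>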
<h2>The output</h2>
<p>Our catalogue software expects a text file with one barcode per line, each prefixed with <code>n:</code>, like so&#8230;</p>
<h3>Output example A</h3>
<pre><code class="prettyprint">n:<font color="red">95053542</font>
n:<font color="red">95053534</font>
n:<font color="red">95053559</font>
n:<font color="red">95053567</font>
n:<font color="red">95053575</font></code></pre>
<h3>Output example B</h3>
<pre><code class="prettyprint">n:<font color="red">8016039R</font>
n:<font color="red">8110727Q</font>
n:<font color="red">84220078</font>
n:<font color="red">8122772T</font></code></pre>
<h2>The problem</h2>
<p>It seems that the folks who do this regularly used a fiddly method: import the file into Excel, set fixed-width field delimiters to isolate the barcode in its own column, and then somehow export everything in the proper format for Millennium. All very tricky, manual, and not much fun&#8230;</p>
<h2>The solution</h2>
<p>The approach we took with this problem was proposed to me by a co-worker (kudos to Bryan Tyson!) who is better versed in Linux and bash scripting than I, but my limited experience with tools like awk and sed really made it seem like one of the easiest solutions to me too.</p>
<p>The following is the bash script we wrote to do the hard work for us. Since I&#8217;m not terribly versed in shell scripting with awk and sed, this took a bit of digging, and we tried to comment the script heavily to keep it legible in the future. Although awk and sed are powerful, they don&#8217;t win any points for readability.</p>
<pre><code class="prettyprint">#!/bin/bash

if [ $# -eq 2 ]; then #If file names are entered as input params
  INFILE="$1" #store first param as input
  OUTFILE="$2" #store second param as output
else #prompt for filenames
  echo "******************************************"
  echo "Welcome to inventory at J.S. Mack Library!"
  echo "This script takes the barcode file from"
  echo "the scanner and formats it for the"
  echo "Millennium inventory program."
  echo "******************************************"
  echo ""

  #Ask user to enter filename to be processed

  echo "What file to process?"
  echo "Include the full path if the file is not"
  echo "in the same directory as this script."
  echo ""
  read -r INFILE

  echo ""
  echo "What file to save the reformatted results?"
  echo "Include the full path if the file is not"
  echo "in the same directory as this script."
  echo ""
  read -r OUTFILE
fi 

#Input files from the scanner may have variable length lines in the following
#format:
#  n chars prefix, 8 char barcode, 11 char postfix
#  n is set by the scanner's "Major Division" setting
#We want to cut out all but the 8 char barcode.
#First, we must determine the length of the prefix.
#We do this with awk to find the length of every line and subtract the barcode
#and postfix from the total, then pipe to sed to get the prefix found for the
#first line. This assumes that every line in the file is the same length.
#(Can the major division change in the middle of a scanner file? Let's hope not!)
PREFIX=$(awk '{print length($0) - 8 - 11}' "$INFILE" | sed 1q)

#Following calculating the prefix length, we store two values to pass
#to the cut command later based on the prefix.
BEGIN=$((PREFIX + 1)) #begin on first char after prefix
END=$((PREFIX + 8)) #end on last char of barcode

#The following executes the cut command and pushes output (1 barcode on each line)
#directly to sed.
#Millennium needs "n:" before each barcode. Using sed, we will insert this at
#beginning of each line and output to filename given by the user.
cut -c"${BEGIN}-${END}" "$INFILE" | sed -e 's/^/n:/' &gt; "$OUTFILE"

echo ""
echo "Your barcode file, ${INFILE}, has been reformatted and saved to ${OUTFILE}."</code></pre>
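<p>The script above assumes every line in a file shares one prefix length. If that ever turned out not to hold, one possible variation (a sketch under that assumption, not the script we used) is to count back from the end of each line instead, so the prefix length never needs to be known. The function name <code>extract_barcodes</code> is hypothetical.</p>

```shell
#!/bin/bash
# Take the 8 characters that sit just before the 11-character postfix on each line,
# then add the "n:" marker that Millennium expects.
# Lines shorter than 8 + 11 = 19 characters cannot hold a barcode and are skipped.
extract_barcodes() {
  awk 'length($0) >= 19 { print "n:" substr($0, length($0) - 18, 8) }'
}

# Two sample lines with different prefix lengths in the same stream.
printf '%s\n' 'TXT9505354295012010:12' '00000300000000008016039R07051001:56' | extract_barcodes
```

<p>For the two sample lines this prints <code>n:95053542</code> and <code>n:8016039R</code>, even though their prefixes differ in length.</p>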
<h2>Conclusion</h2>
<p>Perhaps the most unusual part of this project was that we needed to run it on Windows, so I had to find out how to run bash shell scripts on a Windows box. We used <a href="http://www.cygwin.com/" class="liexternal">Cygwin</a> with some success.</p>
<p>Whether this solution is the most efficient is open to debate. However, it works! If anyone has suggestions for improvements, please comment.</p>
]]></content:encoded>
			<wfw:commentRss>http://duncanandmeg.org/blogs/code/2007/08/17/processing-barcode-scanner-data-with-awk-and-sed/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
	</channel>
</rss>

