<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>BrainBlog</title>
	<atom:link href="http://blogs.nopcode.org/brainstorm/feed/" rel="self" type="application/rss+xml" />
	<link>http://blogs.nopcode.org/brainstorm</link>
	<description>braindumping myself</description>
	<lastBuildDate>Mon, 20 May 2013 09:41:25 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Pragmatic Python versioning via setuptools and git tags</title>
		<link>http://blogs.nopcode.org/brainstorm/2013/05/20/pragmatic-python-versioning-via-setuptools-and-git-tags/</link>
		<comments>http://blogs.nopcode.org/brainstorm/2013/05/20/pragmatic-python-versioning-via-setuptools-and-git-tags/#comments</comments>
		<pubDate>Mon, 20 May 2013 09:41:25 +0000</pubDate>
		<dc:creator>brainstorm</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[howto]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[software]]></category>
		<category><![CDATA[uppnex]]></category>

		<guid isPermaLink="false">http://blogs.nopcode.org/brainstorm/?p=725</guid>
		<description><![CDATA[Some PEPs have revolved around the problem of software versioning and dependency tracking. So, in addition to having some blueprints such as those proposed by the Semantic Versioning guidelines, one needs specifics on how to integrate those practices in our day to day work with version control systems. Setuptools saves the day by introducing versioning [...]]]></description>
				<content:encoded><![CDATA[<p>Several <acronym title="Python Enhancement Proposal">PEP</acronym>s have <a href="http://www.python.org/dev/peps/pep-0386/">revolved</a> <a href="http://www.python.org/dev/peps/pep-0413/">around</a> the problem of software versioning and <a href="http://www.python.org/dev/peps/pep-0440/">dependency tracking</a>.</p>
<p>Blueprints such as those proposed by the <a href="http://semver.org/">Semantic Versioning</a> guidelines are a good start, but one still needs specifics on how to integrate those practices into day-to-day work with version control systems.</p>
<p>Setuptools saves the day by letting the package version be derived from <a href="http://learn.github.com/p/tagging.html">git tags</a>. A post by <a href="http://dcreager.net/2010/02/10/setuptools-git-version-numbers/">Douglas Creager</a> devises a strategy for using setuptools with git tags. With it, the workflow for releasing a new version comes down to:</p>
<ol>
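<li>Tag your release via <em>git tag</em> if the changes are significant.</li>
<li>Run <em>python setup.py install</em> to update the version recorded on the filesystem.</li>
<li>Run <em>git push</em> (with <em>--tags</em> if the tag itself should be published, since a plain push does not send tags).</li>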
</ol>
<p>The following code makes it happen:</p>
<pre class="brush: python; title: ; notranslate">
# Fetch the version from git tags and write it to version.py.
# When git is not available (e.g. a PyPI package), fall back to the stored version.py.
import os
import subprocess

from setuptools import setup

version_py = os.path.join(os.path.dirname(__file__), 'version.py')

try:
    # check_output returns bytes on Python 3, hence the decode()
    version_git = subprocess.check_output([&quot;git&quot;, &quot;describe&quot;]).decode().strip()
except (OSError, subprocess.CalledProcessError):
    # git is missing or the checkout has no tags: reuse the stored version
    with open(version_py, 'r') as fh:
        version_git = fh.read().strip().split('=')[-1].replace('&quot;', '').strip()

version_msg = &quot;# Do not edit this file, pipeline versioning is governed by git tags&quot;
with open(version_py, 'w') as fh:
    # Quote the value so that version.py stays valid, importable Python
    fh.write(version_msg + &quot;\n&quot; + '__version__ = &quot;' + version_git + '&quot;\n')

setup(name=&quot;yourpythonpackage&quot;,
      version=version_git,
</pre>
<p>As an addition to the git tags workflow proposed by Douglas, the &#8216;__version__&#8217; attribute is stored in a version.py file. This allows the version to be tracked even when the git repository is not available (e.g. when installing from a PyPI package), or when the version needs to be queried from inside your own package.</p>
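<p>As a minimal sketch of that last use case, the stored file can be parsed without importing it. The helper name <em>read_version</em> is hypothetical (it is not part of setuptools), and it assumes version.py contains a single <em>__version__</em> assignment, quoted or not:</p>

```python
def read_version(path):
    """Return the value assigned to __version__ in the file at `path`.

    Hypothetical helper: accepts both quoted and unquoted values.
    """
    with open(path) as fh:
        for line in fh:
            name, sep, value = line.partition("=")
            if sep and name.strip() == "__version__":
                # Drop surrounding whitespace and optional quotes
                return value.strip().strip("'\"")
    raise ValueError("no __version__ assignment found in " + path)
```

<p>Parsing rather than importing keeps this working even if the stored value is not a valid Python literal, as can happen with raw <em>git describe</em> output.</p>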
<p>Thanks <a href="http://mussolblog.wordpress.com/">Guillermo</a> and <a href="http://bcbio.wordpress.com/">Brad</a> for the feedback and suggestions on this strategy.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.nopcode.org/brainstorm/2013/05/20/pragmatic-python-versioning-via-setuptools-and-git-tags/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Bad clouds, good clouds</title>
		<link>http://blogs.nopcode.org/brainstorm/2013/04/29/bad-clouds-good-clouds/</link>
		<comments>http://blogs.nopcode.org/brainstorm/2013/04/29/bad-clouds-good-clouds/#comments</comments>
		<pubDate>Mon, 29 Apr 2013 00:52:19 +0000</pubDate>
		<dc:creator>brainstorm</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[cloud]]></category>
		<category><![CDATA[software]]></category>
		<category><![CDATA[university]]></category>
		<category><![CDATA[uppnex]]></category>
		<category><![CDATA[virtualization]]></category>

		<guid isPermaLink="false">http://blogs.nopcode.org/brainstorm/?p=681</guid>
		<description><![CDATA[This is an on-demand blog post, none of the actors are real institutions nor people, anything resembling real life might be pure coincidence So there&#8217;s a day, that day when an organization realizes that there&#8217;s a real need to have a solid cloud platform as an official infrastructure offering. Admit it, we all have some [...]]]></description>
				<content:encoded><![CDATA[<p>This is an on-demand blog post: none of the actors are real institutions or people, and any resemblance to real life might be pure coincidence <img src='http://blogs.nopcode.org/brainstorm/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' /> </p>
<p>So there comes a day when an organization realizes that it really needs a solid cloud platform as an official infrastructure offering. Admit it, we all have some idle cycles we could make better use of.</p>
<blockquote class="twitter-tweet"><p>Why the Rest of Us Need Virtualization Even If Facebook Doesn’t <a href="http://t.co/E3KdHCMLAu" title="http://feedproxy.google.com/~r/PuppetLabs/~3/tUgrKy9tTRk/">feedproxy.google.com/~r/PuppetLabs/…</a></p>
<p>&mdash; Roman Valls (@braincode) <a href="https://twitter.com/braincode/status/328053287620329472">April 27, 2013</a></p></blockquote>
<p><script async src="//platform.twitter.com/widgets.js" charset="utf-8"></script></p>
<h1>A bad cloud</h1>
<p>Then someone owning some computer resources frantically types some commands in a console and <em>voilà</em>, a <strong>beta</strong> cloud service is born. A fictional dialog between <strong>user and cloud provider</strong> follows:</p>
<blockquote><p>
- This is great! I want to have an account on your new service, where can I get it?<br />
- Well, you have to come to our offices, we will <strong>scan your passport</strong>, have a <strong>1-hour-long meeting</strong> and then give you an account.<br />
- Ermm, ok, I just want to use that service&#8230;
</p></blockquote>
<p>In the meeting, there&#8217;s an introduction on how to use the service. Many people did not prepare their machines before the session, and they get stuck on the overcomplicated client installation instructions, which involve installing <strong>pre-compiled binaries and config files inside a .tar.bz</strong> (<a href="http://en.wikipedia.org/wiki/Application_binary_interface" title="ABI">ABI</a> issues galore!). Next, you are given a password via SMS, &#8220;abc123&#8221;, which <strong>you cannot change</strong> (nor are you encouraged to). You have to explicitly ask the admins to change it for you.</p>
<p>Dirty secret: if they are not forced to change it, nobody ever does.</p>
<p>After editing some cloud template text files, your first instance is up and running. Time to clone your <a href="http://cloudbiolinux.org/">CloudBioLinux</a> copy and get some bioinformatics software installed in it&#8230; Unfortunately, it does not take very long to discover that the base distribution is <strong>more than 3 releases old</strong>. The user emails the imaginary beta cloud support and says: </p>
<blockquote><p>
- Hi cloud-support! Is it possible to have the newest Ubuntu release as an image?<br />
- No, it&#8217;s not at the moment.<br />
- Emm, ok, I tried to apt-get dist-upgrade but it just runs out of space, can I get more space in the VM to do that upgrade myself then? <strong>One cannot do much with 4GB of disk</strong> these days, you know.<br />
- No, it is not possible, you can use the 1TB NFS-mounted scratch space instead.<br />
- <strong>Why isn&#8217;t it possible?</strong> Anyway, I see no straightforward way to use that scratch as an extension of the OS, while the VM is running, and I cannot access the filesystem offline and move, say, /usr away without doing some hackish stuff involving squashfs, ramdisks, etc&#8230; This is actually giving me more headaches than it is worth.<br />
- I&#8217;m sorry, we cannot bundle another distribution for you.
</p></blockquote>
<p>So what can we learn from that experience? What can a bad cloud do to become a better cloud?</p>
<ol>
<li>Distributing a <strong>readily installable and tested client CLI package</strong> for the most popular platforms instead of a tarball of precompiled binaries would have cut down that <strong>1-hour-long meeting to nil</strong>. Documentation should never be a substitute or shortcut for a tested, directly installable package.</li>
<li>Distributed passwords, even those sent via SMS, <strong>should adhere to basic <a href="http://en.wikipedia.org/wiki/Password_strength">good password policies</a></strong> at all times, even in beta services. Go for <a href="https://code.google.com/p/google-authenticator/">two-factor authentication</a> if you fancy it.</li>
<li>All services should be <strong>auto-provisioned</strong>. Asking sysadmins to perform routine operations like changing passwords should be off the table.</li>
<li>Dimensioning a cloud (disk, memory, network interfaces) is not an easy task if the users have wildly different needs, but it should at least be possible to easily <strong>increase VM image space</strong>, within reasonable limits. Other resources such as RAM, network interfaces, DNS records and mountpoints should be directly accessible to the user and auto-provisioned.
</li>
<li><strong>Creating new cloud images</strong> from vanilla OSes <em>automatically</em> should be in place before launching the beta.</li>
</ol>
<p>One year passes. Some more console typing and a move to new hardware resources should give the service a new face, ready for a second try.</p>
<p>The same issues arise: instead of automating the deployment of the whole cloud to other machines, the cloud has just been moved to the new hardware. There is <strong><a href="https://github.com/puppetlabs/puppetlabs-opennebula">no evidence of automation</a> being done since last year</strong>.</p>
<h1>A better cloud</h1>
<p><a href="http://ivory.idyll.org/blog/automated-testing-and-research-software.html">Automated testing</a> is about software. Since clouds are software, why not automate bits and pieces of the deployment until it becomes fully automatic? It is easier said than done: it takes a great deal of patience to go and:</p>
<ol>
<li>Build and test a cloud component.</li>
<li>Automate its deployment, testing it elsewhere.</li>
<li>Take the whole cloud stack down, recreating it again from scratch.</li>
<li>Automate basic user-side (stress)-testing: create instance, record a DNS change, attach new volumes, destroy instance, etc&#8230;</li>
</ol>
<p><strong>Automation and testing are hard</strong>; it takes time to get them right without overfitting to your immediate environment. But look, those guys over there seem to have gotten it right:</p>
<blockquote><p>
- So you <strong>only need my public SSH key</strong>? That&#8217;s all? No meetings nor passport, fingerprints, blood samples or photos?<br />
- That&#8217;s exactly right, just log in as root, <strong>break as much as you want in your own cloud</strong>, we can wipe your whole stuff out in less than 20 minutes. We&#8217;ll of course be <strong>gathering metrics from outside</strong>, just in case we detect something bad coming out of your instance(s). <strong>We don&#8217;t want to get in your way</strong>.<br />
- Nice! What about having the latest Ubuntu release&#8230;<br />
- We just provisioned it as we speak (true story).<br />
- I&#8217;m launching some hadoop jobs right now. <a href="https://github.com/guillermo-carrasco/hadoop">It took me a few minutes to provision the nodes</a>. Thank you guys, you&#8217;re awesome!
</p></blockquote>
<p><span id="more-681"></span></p>
<h1>Look around, good (and some free beer) clouds!</h1>
<p>Now, while our in-house clouds get better, how can I try out my idea without spending much money on it? What should we tell our users? <strong>They want some cloud action and they want it now</strong>. While <a href="http://www.openstack.org/">OpenStacks</a> get stacked <a href="http://www.opscode.com/press-releases/opscode-announces-chef-for-openstack/">properly inside your walls</a>, you can go and explore the outside world a bit; we might learn a lot from it!</p>
<p>Those impatient early birds can try locally with something like <a href="http://www.cloudifysource.org/">Cloudify</a>&#8230; Oops, <a href="https://github.com/opscode">Chef recipes</a> do not work out of the box if you are an Apple user.</p>
<blockquote class="twitter-tweet"><p>Giving a try to @<a href="https://twitter.com/cloudifysource">cloudifysource</a>… <a href="https://twitter.com/search/%23fail">#fail</a> when installing <a href="https://twitter.com/search/%23Chef">#Chef</a> in os.getVendor()==Apple… <a href="http://t.co/iCTSkienR9" title="http://jtimberman.housepub.org/blog/2012/07/29/os-x-workstation-management-with-chef/">jtimberman.housepub.org/blog/2012/07/2…</a></p>
<p>&mdash; Roman Valls (@braincode) <a href="https://twitter.com/braincode/status/315120968236412928">March 22, 2013</a></p></blockquote>
<p><script async src="//platform.twitter.com/widgets.js" charset="utf-8"></script></p>
<p>There are some hiccups that could be circumvented with VirtualBox though:</p>
<blockquote class="twitter-tweet"><p>@<a href="https://twitter.com/braincode">braincode</a> check this project from @<a href="https://twitter.com/fastconnect">fastconnect</a> <a href="http://t.co/mk72kraH21" title="http://ow.ly/1TRJsg">ow.ly/1TRJsg</a>. We plan to have this as a native capability in Cloudify as well</p>
<p>&mdash; Uri Cohen (@uri1803) <a href="https://twitter.com/uri1803/status/315195558610485248">March 22, 2013</a></p></blockquote>
<p><script async src="//platform.twitter.com/widgets.js" charset="utf-8"></script></p>
<p>But wait, there are other alternatives that don&#8217;t depend on your wacky workstation/laptop configuration. They might not be as powerful (HPCCloud users, move along), but they are enough for starters. Heroku is a classic, with an excellent CLI tool and <strong>mandatory version control</strong> that can get you started very quickly in the cloud <acronym title="Platform as a Service">PaaS</acronym> arena:</p>
<p><a href="https://get.heroku.com/"><img src="http://3.bp.blogspot.com/-bxj9LtU6bJE/UGl0Idls8_I/AAAAAAAAAFo/ld8Pk5OWGGE/s1600/heroku-logo-white.jpg" alt="heroku logo" width="186" height="43" class="alignnone"/></a></p>
<p>Then, another interesting platform is dotcloud, with its recently published cloud stack/backend, following in the footsteps of Heroku:</p>
<p><a href="https://www.dotcloud.com/"><img src="https://www.dotcloud.com/static/img/logo.png" width="186" height="43" class="alignnone"/></a></p>
<p>Until a few days ago, free <strong>best effort</strong> instances were offered, which meant &#8220;put your application here, we will run it if we have enough resources left&#8221;. Now you can build your <a href="http://blog.dotcloud.com/new-sandbox">own dotcloud</a> in your organization, if <a href="http://en.wikipedia.org/wiki/Platform_as_a_service">PaaS</a> is what you need.</p>
<h1>Conclusion</h1>
<p>Being an <acronym title="Infrastructure as a Service">IaaS</acronym> cloud provider can be a bumpy experience; it is not trivial. In this post I wanted to outline three different worldviews: the hurried in-house cloud service provider, the more automated in-house provider and the industry-quality for-pay provider. The goal is clear, yet tricky to implement in reality. We want to reach a point where:</p>
<ol>
<li><strong>No human intervention</strong> and support is required to instantiate basic and small clouds. <strong>Obsessive automation culture</strong> is paramount to reach that stage.</li>
<li>User auto-provisioning is enabled as a result of the first point.</li>
<li>Commonly used services are wrapped through <strong>public</strong> <a href="https://github.com/opscode/cookbooks"><strong>deployment recipes</strong></a> (cloudify or chef are examples), and <a href="http://ivory.idyll.org/blog/research-software-reuse.html">re-use and improvement of those recipes</a> is encouraged.</li>
<li>Good practices are enforced on the developer/user side, such as <strong>version control as the only way to deploy applications</strong>.</li>
<li>Security is enforced <strong>from outside the instances</strong>: <strong>don&#8217;t get in the way of the users</strong> unless bad things are happening or about to happen. Make sure you have implemented <a href="http://aws.amazon.com/security">some non-intrusive security measures</a> before announcing the cloud service.</li>
<li>A consistent <strong>API/CLI toolbox/toolbelt</strong> is provided to control various aspects of the instances, like the industry-quality providers have.</li>
</ol>
]]></content:encoded>
			<wfw:commentRss>http://blogs.nopcode.org/brainstorm/2013/04/29/bad-clouds-good-clouds/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Automated Python education via unit testing and Travis-CI</title>
		<link>http://blogs.nopcode.org/brainstorm/2013/03/04/automated-python-education-via-unit-testing-and-travis-ci/</link>
		<comments>http://blogs.nopcode.org/brainstorm/2013/03/04/automated-python-education-via-unit-testing-and-travis-ci/#comments</comments>
		<pubDate>Mon, 04 Mar 2013 00:24:41 +0000</pubDate>
		<dc:creator>brainstorm</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[KTH]]></category>
		<category><![CDATA[university]]></category>
		<category><![CDATA[unix]]></category>
		<category><![CDATA[uppnex]]></category>

		<guid isPermaLink="false">http://blogs.nopcode.org/brainstorm/?p=663</guid>
		<description><![CDATA[Sometimes education can be a daunting process. It is quite obvious from the student side, we all have gone through exercises, corrections, learning what we did wrong on some of them, fixing and learning from those errors, rinse and repeat. That&#8217;s how it generally works. On the teacher&#8217;s side, correcting assignments is easy and unbiased [...]]]></description>
				<content:encoded><![CDATA[<p><a href="https://github.com/pythonkurs"><img src="https://secure.gravatar.com/avatar/fa4bdc52c6751783a1bf9f7c084acea3?s=420&#038;d=https://a248.e.akamai.net/assets.github.com%2Fimages%2Fgravatars%2Fgravatar-org-420.png" width="420" height="420" class /></a></p>
<p>Sometimes <strong>education can be a daunting process</strong>. It is quite obvious from the student side: we have all gone through exercises, corrections, learning what we did wrong on some of them, fixing and learning from those errors, rinse and repeat. That&#8217;s how it generally works.</p>
<p>On the teacher&#8217;s side, correcting assignments is easy and unbiased unless the number of students is considerably large. For our now official <a href="http://www.kth.se/" title="Kungliga Tekniska Högskolan" target="_blank">KTH</a> course <a href="http://www.kth.se/student/kurser/kurs/DD3436?l=en" title="Scientific Programming in Python for Computational Biology" target="_blank">&#8220;DD3436 Scientific Programming in Python for Computational Biology&#8221;</a> I was asked to hold a session on <strong>software testing and continuous integration in Python</strong>&#8230; for around 50 students.</p>
<p><span id="more-663"></span></p>
<p>So I wanted to find a simple way to <strong>teach Python</strong> without going down the old beaten track of the boring Fibonacci function and endless lists of exercises. Something like little incremental goals that keep students hooked until they finish the training for the session.</p>
<p>Then I discovered the <a href="https://github.com/brainstorm/python_koans" title="python koans">python koans</a> by <a href="https://github.com/gregmalcolm">Greg Malcolm</a>. By themselves, they are a very good way to learn Python basics and unit testing.</p>
<p>But what would happen if those pykoans respected standard UNIX <a href="https://github.com/gregmalcolm/python_koans/pull/40">exit codes</a> and had a basic <a href="https://github.com/gregmalcolm/python_koans/pull/41">Travis-CI integration</a>? Indeed, that brings us testing and continuous integration for free!</p>
<p>And then, as a teacher, what if there were easy means to <a href="https://github.com/brainstorm/pytravis/blob/master/eval_pykoans.py">monitor</a> students&#8217; progress and <a href="https://github.com/brainstorm/pytravis/blob/master/koans_completed.py">correct</a> assignments via some <a href="https://api.travis-ci.org/docs/">APIs and JSON</a>?</p>
<p>The end result is that teaching basic Python along with <strong>good programming practices</strong> such as <acronym title="Test Driven Development">TDD</acronym> and <acronym title="Continuous Integration">CI</acronym> is much easier nowadays. Teachers get to help students better without <strong>correcting tons of assignments manually</strong> and students get <a href="https://travis-ci.org/hugerth/python_koans">green TDD lights</a> as they go.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.nopcode.org/brainstorm/2013/03/04/automated-python-education-via-unit-testing-and-travis-ci/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>The &#8220;module system&#8221;: The good, the bad and the ugly</title>
		<link>http://blogs.nopcode.org/brainstorm/2011/11/23/module-system-bad-and-ugly/</link>
		<comments>http://blogs.nopcode.org/brainstorm/2011/11/23/module-system-bad-and-ugly/#comments</comments>
		<pubDate>Tue, 22 Nov 2011 23:41:44 +0000</pubDate>
		<dc:creator>brainstorm</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[software]]></category>
		<category><![CDATA[sysadmin]]></category>
		<category><![CDATA[university]]></category>
		<category><![CDATA[unix]]></category>
		<category><![CDATA[uppnex]]></category>

		<guid isPermaLink="false">http://blogs.nopcode.org/brainstorm/?p=579</guid>
		<description><![CDATA[Dealing with software package management can be a daunting task, even for experienced sysadmins. From the long forgotten graft, going through the modern and insanely tweakable portage to the (allegedly) multiplatform pkgsrc or the very promising xbps, several have tried to build an easy to use, community-driven, simple, with good dependency-handling, optimal, reliable, generic and [...]]]></description>
				<content:encoded><![CDATA[<p>Dealing with software <a href="http://ianmurdock.com/solaris/how-package-management-changed-everything/">package management</a> can be a daunting task, even for experienced sysadmins. From the long forgotten <a href="http://peters.gormand.com.au/Home/tools/graft/graft-html">graft</a>, going through the modern and insanely tweakable <a href="http://www.gentoo.org/proj/en/devrel/handbook/handbook.xml?part=2&#038;chap=1">portage</a> to the (allegedly) multiplatform <a href="http://www.netbsd.org/docs/software/packages.html">pkgsrc</a> or the very promising <a href="http://code.google.com/p/xbps/">xbps</a>, several projects have tried to build a packaging system that is easy to use, community-driven, simple, good at dependency handling, optimal, reliable, generic and portable.</p>
<p>In my experience on both sides of the iron, as a sysadmin and developer, <strong>none</strong> of them works as one would like.</p>
<p>But first, let&#8217;s explore what several <acronym title='High Performance Computing'>HPC</acronym> centers have adopted as a solution and why&#8230; and most importantly, how to fix it eventually.</p>
<p><span id="more-579"></span></p>
<h2>The good</h2>
<p>Widely used in different research facilities, the <a href="http://modules.sf.net/">module system</a> lets users choose between different versions of the installed software. The approach is simple: just type <strong>&#8220;module load program/1.0&#8221;</strong> and off you go.</p>
<p>On the sysadmin side, it&#8217;s the <strong>same old familiar spell &#8220;tar xvfz &#038;&#038; make &#038;&#038; make install&#8221;</strong>, and a &#8220;vim program&#8221; to define the module script that will set PATH, LD_LIBRARY_PATH and whatever other variables are needed.</p>
<p>Consequently, the time required to wrap a piece of software is minimal, giving sysadmins quick-hack superpowers. After all, <a href="http://blog.jcuff.net/2011/04/velocity-in-research-computing-really.html">velocity in research does matter</a>, and <strong>getting things done</strong> so that research can continue on its way is mandatory.</p>
<p>Moreover, user-coded modules can be shared easily within the same cluster by simply tweaking the MODULEPATH variable. What&#8217;s the catch? <a href="http://en.wikipedia.org/wiki/Technical_debt">Technical debt</a> and, most importantly, lack of <a href="http://www.opscode.com/chef/">automation</a>.</p>
<h2>The bad</h2>
<p>Software <strong>packaging is a time-consuming task</strong> whose results shouldn&#8217;t be kept inside institutional cluster firewalls, <a href="https://github.com/scilifelab/modules.sf.net">but openly published</a>. Indeed, a single program may have been <strong>re-packaged</strong> a number of times, once on each academic cluster of each university department that has HPC resources. When a new version of a package comes out, the sysadmin has to take care of bumping it by creating directories and additional recipes. How does one justify this time investment? <strong>It just doesn&#8217;t scale</strong>. Skip to the &#8220;Solutions?&#8221; section for some relief.</p>
<p>From a technical perspective, using package systems that are not shipped with the operating system introduces an <strong>extra layer of complexity</strong>. More often than not, updates on the base distribution will break compiled programs that rely on old libraries. <strong>Stacking package managers should be considered harmful</strong>.</p>
<p>Ruby, Python and Perl have their own mature ways to install packages on most UNIXes. Stacking package managers by rpm-packaging Python or Ruby modules has <a href="http://stakeventures.com/articles/2008/12/04/rubygem-is-from-mars-aptget-is-from-venus">several</a> <a href="http://www.b-list.org/weblog/2008/dec/14/packaging/">bad</a> consequences. Granted, there are some concerns about uniformity, updates and security, but those again can be solved by the individual package managers.</p>
<p>But getting back to the <a href="http://modules.sf.net">module system</a>, <strong>how well does it play with cloud computing</strong>?</p>
<p>It doesn&#8217;t, thankfully!</p>
<p>One would have to install all the modules, and re-package the software for the virtual instances. In contrast, existing package systems, be it rpm, deb, <a href="http://pypi.python.org/pypi">pip</a>, <a href="http://rubygems.org/">gem</a> or <a href="http://clojars.org/">lein</a> solved that by themselves. On top of that, the module system will tweak crucial system variables such as $PATH or $LD_LIBRARY_PATH with bad side effects for <a href="http://blogs.nopcode.org/brainstorm/2011/06/23/how-to-install-python-modules-with-virtualenv-on-uppmax/">python virtual environments</a> or any other user-defined PATHs.</p>
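<p>The PATH side effect can be reproduced in a few lines on a POSIX system. In this sketch the directory names and the <em>mytool</em> executable are invented for the demonstration; it only shows that the directory prepended last wins the lookup, which is how a loaded module can shadow the tools of an active virtual environment:</p>

```python
import os
import shutil
import stat
import tempfile


def make_tool(directory, name="mytool"):
    """Create a tiny executable file called `name` inside `directory`."""
    path = os.path.join(directory, name)
    with open(path, "w") as fh:
        fh.write("#!/bin/sh\n")
    os.chmod(path, os.stat(path).st_mode | stat.S_IXUSR)
    return path


def resolve(name, directories):
    """Look up `name` the way a shell would, given an ordered directory list."""
    return shutil.which(name, path=os.pathsep.join(directories))


with tempfile.TemporaryDirectory() as root:
    venv_bin = os.path.join(root, "venv", "bin")      # hypothetical virtualenv
    module_bin = os.path.join(root, "module", "bin")  # hypothetical module dir
    os.makedirs(venv_bin)
    os.makedirs(module_bin)
    venv_tool = make_tool(venv_bin)
    module_tool = make_tool(module_bin)

    # The virtualenv was activated first; the module was loaded afterwards
    # and prepended its own directory, so it now comes first in the lookup.
    found = resolve("mytool", [module_bin, venv_bin])
    print("module copy shadows the virtualenv copy:", found == module_tool)
```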
<p>Lastly, from a human resources perspective, writing <a href="http://modules.sf.net">modules</a> does not <strong>add value or expertise to your IT toolbox</strong>. On the other hand, search engines have something to say when looking up search terms such as <a href="http://duckduckgo.com/?q=experience+packaging+software+rpm+deb+job">job experience packaging software rpm deb</a>. Actually, being involved in open source communities via packaging can give you some very valuable insights on how open source projects work.</p>
<h2>The ugly</h2>
<p>With aging and relatively unmaintained software, some bugs arise. Under some circumstances here&#8217;s what occurs:</p>
<pre>
$ modulecmd bash purge
*** glibc detected *** modulecmd: free(): invalid next size (fast): 0x0000000001b88050 ***
======= Backtrace: =========
(...)

$ export MODULEPATH=AAAAAAAAAAAAAAAAAAAAAAAAA:AAAAAAAAAAAAAAAAAAAAA:/bin/bash &#038;&#038; modulecmd bash purge
    *** glibc detected *** modulecmd: corrupted double-linked list: 0x00000000009c4600 ***
</pre>
<p>I would like to light a candle for those who dare to run modulecmd with the <a href="http://en.wikipedia.org/wiki/Setuid">suid bit</a> set. I can only think of <a href="http://isp.surfnet.fi/aktuelltbilder/schukken.jpg">one sysadmin</a> that could do that while being totally self-confident.</p>
<h2>Solutions?</h2>
<p>Here&#8217;s some brainstorming that might help in the long run:</p>
<ol>
<li>Instead of complicating infrastructure, just state the software versions you are running in your publications. In Python, <strong>pip freeze</strong> helps. Want an older version? DIY.</li>
<li>Use <a href="http://stanford.edu/~pgbovine/cde.html">CDE</a>, <a href="http://rbenv.org/">rbenv</a> and <a href="http://pypi.python.org/pypi/virtualenv">virtualenv</a> before and after publishing if concerned about platform updates during your research.</li>
<li>Use <a href="http://cloudbiolinux.org/">virtual machine images</a> to reproduce experiments.</li>
<li>If you really need the module system, at least <a href="https://github.com/scilifelab/modules.sf.net">publish the modules somewhere</a> for people to reuse them.</li>
<li>If you are a sysadmin, get started with <a href="http://www.semicomplete.com/blog/tags/deb">FPM</a> as a first approach to the world of package management for your distribution.</li>
<li>Try to get those packages accepted upstream (<strong>best!</strong>) and/or create your own rpm/deb <a href="http://www.howtoforge.com/creating_a_local_yum_repository_centos">repo</a>.</li>
<li>Learn <a href="http://puppetlabs.com/">puppet</a> and/or <a href="http://www.opscode.com/chef/">chef</a>.</li>
</ol>
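<p>The first point can also be sketched from within Python itself. This assumes Python 3.8 or newer, where <em>importlib.metadata</em> is part of the standard library; the <em>installed_versions</em> helper is hypothetical and only approximates what <em>pip freeze</em> prints:</p>

```python
from importlib import metadata


def installed_versions():
    """Return sorted 'name==version' strings for the installed distributions."""
    pins = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        version = dist.metadata.get("Version")
        if name and version:  # skip broken installs with incomplete metadata
            pins.add("{0}=={1}".format(name, version))
    return sorted(pins, key=str.lower)


if __name__ == "__main__":
    print("\n".join(installed_versions()))
```

<p>Saving that output next to the results of an analysis gives readers the exact versions to state in a publication.</p>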
]]></content:encoded>
			<wfw:commentRss>http://blogs.nopcode.org/brainstorm/2011/11/23/module-system-bad-and-ugly/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Galaxy on UPPMAX, simplified</title>
		<link>http://blogs.nopcode.org/brainstorm/2011/08/22/galaxy-on-uppmax-simplified/</link>
		<comments>http://blogs.nopcode.org/brainstorm/2011/08/22/galaxy-on-uppmax-simplified/#comments</comments>
		<pubDate>Mon, 22 Aug 2011 14:36:28 +0000</pubDate>
		<dc:creator>brainstorm</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[bio]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[software]]></category>
		<category><![CDATA[uppnex]]></category>

		<guid isPermaLink="false">http://blogs.nopcode.org/brainstorm/?p=550</guid>
		<description><![CDATA[This post is intended to be shortened over time, eventually becoming an automated procedure&#8230; a wiki-post from dahlo&#8217;s magic until upstream patches settle down. All commands are issued on the cluster, unless otherwise stated. Please report any issues via comments! Firstly, follow my earlier post on how to set up your own python virtual environment [...]]]></description>
				<content:encoded><![CDATA[<p>This post is intended to be shortened over time, eventually becoming an automated procedure&#8230; a wiki-post from <a href="http://mdahlo.blogspot.com/2011/06/galaxy-on-uppmax.html">dahlo&#8217;s magic</a> until upstream patches settle down. All commands are issued on the cluster, unless otherwise stated.</p>
<p>Please report any issues via comments!</p>
<ol>
<li>Firstly, follow my earlier <a href="http://blogs.nopcode.org/brainstorm/2011/06/23/how-to-install-python-modules-with-virtualenv-on-uppmax/" title="How to install python modules with VirtualEnv… on UPPMAX">post</a> on how to set up your own python virtual environment on UPPMAX.</li>
<li>Once you have a prompt similar to <em>(devel) hostname ~$</em>, you can continue; otherwise, go back to step 1.</li>
<li>pip install drmaa Mercurial PyYAML</li>
<li>Add the following env variables to your .bashrc:
<pre>
export DRMAA_LIBRARY_PATH=/bubo/sw/apps/build/slurm-drmaa/lib/libdrmaa.so
export DRMAA_PATH=$DRMAA_LIBRARY_PATH
</pre>
</li>
<li>Create a file ~/.slurm_drmaa.conf with the contents:
<pre>
job_categories: {
      default: "-A &lt;your project_account&gt; -p devel"
}
</pre>
</li>
<li>hg clone http://bitbucket.org/brainstorm/galaxy-central</li>
<li>Edit universe_wsgi.ini from the provided sample so that it contains:
<pre>
admin_users = &lt;your_admin_user&gt;@example.com
enable_api = True
start_job_runners = drmaa
default_cluster_job_runner = drmaa://-A &lt;your project_account&gt; -p devel
</pre>
</li>
<li>On your local machine: <em>ssh -f &lt;your_user&gt;@&lt;uppmax&gt; -L 8080:localhost:8080 -N</em></li>
<li>On your local machine: Fire up your browser and connect to http://localhost:8080</li>
</ol>
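<p>Before starting Galaxy it can save time to verify the environment from steps 4 and 5. The following is only a sketch: <em>check_drmaa_setup</em> is a hypothetical helper, not part of Galaxy or of the drmaa package, and it checks nothing beyond the presence of the variable and the two files mentioned above:</p>

```python
import os


def check_drmaa_setup(environ=None, home=None):
    """Return a list of human-readable problems (empty if none were found)."""
    environ = os.environ if environ is None else environ
    home = os.path.expanduser("~") if home is None else home
    problems = []

    library = environ.get("DRMAA_LIBRARY_PATH")
    if not library:
        problems.append("DRMAA_LIBRARY_PATH is not set")
    elif not os.path.isfile(library):
        problems.append("DRMAA library not found: " + library)

    config = os.path.join(home, ".slurm_drmaa.conf")
    if not os.path.isfile(config):
        problems.append("missing " + config)

    return problems


if __name__ == "__main__":
    for problem in check_drmaa_setup():
        print("WARNING: " + problem)
```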
<p>As a beta tester you may expect some issues when running Galaxy this way. Firstly, keep in mind that it will not perform as fast as a <a href="http://wiki.g2.bx.psu.edu/Admin/Config/Performance/Production%20Server" title="Running galaxy on production">production-quality setup</a>; it&#8217;s just a developer instance. Furthermore, the node you&#8217;re on might have time limit restrictions, meaning that your instance will be killed in 30 minutes if you don&#8217;t reserve a slot beforehand, as <a href="http://mdahlo.blogspot.com/2011/06/galaxy-on-uppmax.html" title="Galaxy on UPPMAX by Martin Dahlo">Martin</a> recommended in the section &#8220;Run galaxy on a node&#8221;.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.nopcode.org/brainstorm/2011/08/22/galaxy-on-uppmax-simplified/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
	</channel>
</rss>
