<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:lj="http://www.livejournal.com">
  <id>urn:lj:livejournal.com:atom1:illiterat</id>
  <title>James Antill</title>
  <subtitle>James Antill</subtitle>
  <author>
    <name>James Antill</name>
  </author>
  <link rel="alternate" type="text/html" href="http://illiterat.livejournal.com/"/>
  <link rel="self" type="text/xml" href="http://illiterat.livejournal.com/data/atom"/>
  <updated>2011-09-26T19:05:14Z</updated>
  <lj:journal userid="1038590" username="illiterat" type="personal"/>
  <link rel="service.feed" type="application/x.atom+xml" href="http://illiterat.livejournal.com/data/atom" title="James Antill"/>
  <entry>
    <id>urn:lj:livejournal.com:atom1:illiterat:8485</id>
    <link rel="alternate" type="text/html" href="http://illiterat.livejournal.com/8485.html"/>
    <link rel="self" type="text/xml" href="http://illiterat.livejournal.com/data/atom/?itemid=8485"/>
    <title>XML grep</title>
    <published>2011-09-26T19:05:14Z</published>
    <updated>2011-09-26T19:05:14Z</updated>
    <category term="xml"/>
    <category term="grep"/>
    <content type="html">For a long time I've wanted to be able to do an "XML grep" on XML data, where I see just the nodes I care about in an XML file. After recently hitting this problem again, I found out about xmlstarlet and after a _lot_ of work I managed to get what I wanted. So I figured I'd write it down, for both of us:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;
xmlstarlet sel -I -t \
  -m '/updates/update' \
  -i 'pkglist/collection/package[@name="raydium"]' \
  -c . \
  /var/cache/yum/x86_64/15/updates/gen/updateinfo.xml
&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;...I'll explain each line of the above:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;br /&gt;&lt;li&gt;{xmlstarlet sel -I -t} -- Runs the command in "select" mode, with automatic indentation of the output (-I), and starts a template (-t).&lt;/li&gt;&lt;br /&gt; &lt;br /&gt;&lt;li&gt;{-m '/updates/update'} -- Where we are matching "from". Comparing this to grep, we are kind of saying that each node at &amp;lt;updates&amp;gt;&amp;lt;update&amp;gt; is a line. This is an XPath expression.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;{-i 'pkglist/collection/package[@name="raydium"]'} -- The condition we want to match on. Here we are saying that we want to match on anything of the form &amp;lt;updates&amp;gt;&amp;lt;update&amp;gt;&amp;lt;pkglist&amp;gt;&amp;lt;collection&amp;gt;&amp;lt;package name="raydium"&amp;gt;. This is an XPath expression, relative to the -m match (it's worth repeating that the @ changes the match from a node to an attribute).&lt;/li&gt;&lt;br /&gt;&lt;li&gt;  {-c .} -- This says to display (copy) the whole node matched by the -m expression above, for each match that passes our condition.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;  { /var/cache/yum/x86_64/15/updates/gen/updateinfo.xml } -- The XML file to operate on (if it's missing it defaults to stdin).&lt;/li&gt;&lt;br /&gt;&lt;/ul&gt;</content>
  </entry>
  <entry>
    <id>urn:lj:livejournal.com:atom1:illiterat:8221</id>
    <link rel="alternate" type="text/html" href="http://illiterat.livejournal.com/8221.html"/>
    <link rel="self" type="text/xml" href="http://illiterat.livejournal.com/data/atom/?itemid=8221"/>
    <title>A few things you might not know about RHEL-6.1+ yum</title>
    <published>2011-05-19T18:02:04Z</published>
    <updated>2011-05-20T21:42:39Z</updated>
    <category term="package management"/>
    <category term="yum"/>
    <category term="rhel"/>
    <content type="html">&lt;h1&gt;Time to look at a few features of yum in &lt;a href="http://www.redhat.com/about/news/prarchive/2011/Red-Hat-Delivers-Red-Hat-Enterprise-Linux-6-1" rel="nofollow"&gt;RHEL-6.1&lt;/a&gt; now that it's released&lt;/h1&gt;

&lt;ul&gt;
 &lt;li&gt;&lt;h3&gt;Search is more user-friendly&lt;/h3&gt; &lt;a name="cutid1"&gt;&lt;/a&gt;&lt;p&gt; As we maintain yum we are always looking for the "minor" changes that can make a big difference to the user, and this is probably one of the biggest minor changes. As of late RHEL-5 and RHEL-6.0, "yum search" was great for finding obscure things that you knew something about, but with 6.1 we've hopefully made it useful for finding the "everyday" packages whose exact name you can't remember. We did this by excluding a lot of the "extra" hits when you get a large search result. For instance, "yum search kvm manager" is pretty useless in RHEL-6.0, but in RHEL-6.1 you should find what you want very quickly.
 &lt;/p&gt;
&lt;b&gt; Example commands:&lt;/b&gt;
&lt;pre&gt;
yum search kvm manager
yum search python url
&lt;/pre&gt;
&lt;a name='cutid1-end'&gt;&lt;/a&gt;&lt;/li&gt;

 &lt;li&gt; &lt;h3&gt;The updateinfo command &lt;/h3&gt;&lt;a name="cutid2"&gt;&lt;/a&gt; &lt;p&gt; The "yum-security" or "yum-plugin-security" package has been around since early RHEL-5, but the RHEL-6.1 update has introduced the "updateinfo" command to make things a little easier to use, and you can now easily view &lt;b&gt;installed&lt;/b&gt; security errata (to more easily make sure you are secure). We've also added a few new pieces of data to the RHEL updateinfo data. Probably the most significant is that as well as errata being marked "security" or not they are now tagged with their "severity". So you can automatically apply only "critical" security updates, for example.
 &lt;/p&gt;
&lt;b&gt; Example commands:&lt;/b&gt;
&lt;pre&gt;
yum updateinfo list security all
yum update-minimal --sec-severity=critical
&lt;/pre&gt;
&lt;a name='cutid2-end'&gt;&lt;/a&gt; &lt;/li&gt;

 &lt;li&gt; &lt;h3&gt;The versionlock command &lt;/h3&gt;&lt;a name="cutid3"&gt;&lt;/a&gt; &lt;p&gt; As with the previous point we've had "yum-plugin-versionlock" for a long time, but now we've made it easier to use and put all its functions under a single "versionlock" sub-command. You can now also "exclude" specific versions you don't want, instead of locking to known good specific ones you had tested.
 &lt;/p&gt;
&lt;b&gt; Example commands:&lt;/b&gt;
&lt;pre&gt;
# Lock to the version of yum currently installed.
yum versionlock add yum
# Opposite, disallow versions of yum currently available:
yum versionlock exclude yum

yum versionlock list
yum versionlock delete yum\*
yum versionlock clear

# This will show how many "excluded" packages are in each repo.
yum repolist -x .
&lt;/pre&gt;
&lt;a name='cutid3-end'&gt;&lt;/a&gt;&lt;/li&gt;

 &lt;li&gt; &lt;h3&gt;Manage your own .repo variables &lt;/h3&gt; &lt;a name="cutid4"&gt;&lt;/a&gt;&lt;p&gt; This is actually available in RHEL-6.0, but given that almost nobody knows about it I thought I'd share it here. You can put files in "/etc/yum/vars" and then use the names of those files as variables in any yum configuration, just like $basearch or $releasever. There is also a special $uuid variable, so you can track individual machines if you want to.
 &lt;/p&gt;&lt;a name='cutid4-end'&gt;&lt;/a&gt;&lt;/li&gt;

 &lt;li&gt; &lt;h3&gt;yum has its own &lt;a href="http://yum.baseurl.org/wiki/YumDB" rel="nofollow"&gt;DB&lt;/a&gt; &lt;/h3&gt; &lt;a name="cutid5"&gt;&lt;/a&gt;&lt;p&gt; Again, this is something that was there in RHEL-6.0 but has improved (and is likely to improve more over time). The most noticeable addition is that we now store the "installed_by" and "changed_by" attributes. This could be worked out from "yum history" before, but now it's easily available directly from the installed package.
 &lt;/p&gt;
&lt;b&gt; Example commands:&lt;/b&gt;
&lt;pre&gt;
yumdb
yumdb info yum
yumdb set installonly keep kernel-2.6.32-71.7.1.el6
yumdb sync
&lt;/pre&gt;
&lt;a name='cutid5-end'&gt;&lt;/a&gt;&lt;/li&gt;

 &lt;li&gt; &lt;h3&gt;Additional data in "yum history" &lt;/h3&gt; &lt;a name="cutid6"&gt;&lt;/a&gt;&lt;p&gt; Again, this is something that was there in RHEL-6.0 but has improved (and is likely to improve more over time). The most noticeable additions are that we now store the command line and we store a "transaction file" that you can use on other machines.
 &lt;/p&gt;
&lt;b&gt; Example commands:&lt;/b&gt;
&lt;pre&gt;
yum history
yum history pkgs yum
yum history summary

yum history undo last

yum history addon-info 1    config-main
yum history addon-info last saved_tx
&lt;/pre&gt;
&lt;a name='cutid6-end'&gt;&lt;/a&gt;&lt;/li&gt;

 &lt;li&gt; &lt;h3&gt;"yum install" is now fully kickstart compatible&lt;/h3&gt; &lt;a name="cutid7"&gt;&lt;/a&gt;&lt;p&gt; As of RHEL-6.0 there was one thing you could do in a kickstart package list that you couldn't do in "yum install" and that was to "remove" packages with "-package". As of the RHEL-6.1 yum you can do that, and we also added that functionality to upgrade/downgrade/remove. Apart from anything else, this should make it very easy to turn the kickstart package list into "yum shell" files (which can even be run in kickstart's %post).
 &lt;/p&gt;
&lt;b&gt; Example commands:&lt;/b&gt;
&lt;pre&gt;
 yum install 'config(postfix) &amp;gt;= 2.7.0'
 yum install MTA
 yum install '/usr/kerberos/sbin/*'
 yum -- install @books -javanotes
&lt;/pre&gt;
&lt;a name='cutid7-end'&gt;&lt;/a&gt;&lt;/li&gt;

 &lt;li&gt; &lt;h3&gt;Easier to change yum configuration &lt;/h3&gt; &lt;a name="cutid8"&gt;&lt;/a&gt;&lt;p&gt; We tended to get a lot of feature requests for a plugin to add a command line option so the user could change a single yum.conf variable, and we had to evaluate those requests for general distribution based on how much we thought all users would want/need them. With the RHEL-6.1 yum we created the --setopt option so that any configuration variable can be changed easily, without having to create a specific bit of code. There were also some updates to the yum-config-manager command.
 &lt;/p&gt;
&lt;b&gt; Example commands:&lt;/b&gt;
&lt;pre&gt;
yum --setopt=alwaysprompt=false upgrade yum

yum-config-manager
yum-config-manager --enable myrepo
yum-config-manager --add-repo https://example.com/myrepo.repo
&lt;/pre&gt;
&lt;a name='cutid8-end'&gt;&lt;/a&gt;&lt;/li&gt;

 &lt;li&gt; &lt;h3&gt;Working towards managing 10 machines easily &lt;/h3&gt; &lt;a name="cutid9"&gt;&lt;/a&gt;&lt;p&gt; yum is the best way to manage a single machine, but it isn't quite as good at managing 10 identical machines. While the RHEL-6.1 yum still isn't great at this, we've made a few improvements that should help significantly. The biggest is probably the "load-ts" command, and the infrastructure around it, which allows you to easily create a transaction on one machine, test it, and then "deploy" it to a number of other machines. When loading the transaction, yum checks that the machines started from the same place (via rpmdb versions), so that you know you are doing &lt;b&gt;the same operation&lt;/b&gt;.
 &lt;/p&gt;
 &lt;p&gt; Also worth noting is that we have added a plugin hook to the "package verify" operation, allowing things like "puppet" to hook into the verification process. A prototype of what that should allow those kinds of tools to do was written by Seth Vidal &lt;a href="http://skvidal.wordpress.com/2011/01/12/reverify/" rel="nofollow"&gt;here&lt;/a&gt;.
 &lt;/p&gt;
&lt;b&gt; Example commands:&lt;/b&gt;
&lt;pre&gt;
# Find the current rpmdb version for this machine (available in RHEL-6.0)
yum version nogroups

# Completely re-image a machine, or dump its "package image"
yum-debug-dump
yum-debug-restore 
    --install-latest
    --ignore-arch
    --filter-types=install,remove,update,downgrade

# This is the easiest way to get a transaction file without modifying the rpmdb
echo | yum update blah
ls ${TMPDIR:-/tmp}/yum_save_tx-* | sort | tail -1

# You can now load a transaction and/or see the previous transaction from the history
yum load-ts /tmp/yum_save_tx-2011-01-17-01-00ToIFXK.yumtx
yum -q history addon-info last saved_tx &amp;gt; my-yum-saved-tx.yumtx
&lt;/pre&gt;
&lt;a name='cutid9-end'&gt;&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</content>
  </entry>
  <entry>
    <id>urn:lj:livejournal.com:atom1:illiterat:7944</id>
    <link rel="alternate" type="text/html" href="http://illiterat.livejournal.com/7944.html"/>
    <link rel="self" type="text/xml" href="http://illiterat.livejournal.com/data/atom/?itemid=7944"/>
    <title>Yum "update" from F-11 to F-13, with some help from distro-sync</title>
    <published>2010-06-11T20:34:28Z</published>
    <updated>2010-06-11T20:34:28Z</updated>
    <category term="pacakging"/>
    <category term="yum"/>
    <category term="fedora"/>
    <content type="html">&lt;p&gt;Here is my experience of updating a machine from F-11 to F-13, on 2010-05-30, with plain yum over ssh. I got all the info. from "yum history", and had a little help from "distro-sync" which is only in rawhide's yum atm.

&lt;a name="cutid1"&gt;&lt;/a&gt;

&lt;h2&gt; The "simple" start. &lt;/h2&gt;
&lt;p&gt; First off I updated yum to the latest from rawhide, "yum-3.2.27-13.fc14.noarch". Then I updated to the new release via "yum --releasever=13 update fedora-release". Then I ran my first "yum distro-sync" to do the update, and hit depsolving problems.
&lt;/p&gt;

&lt;h2&gt; First problem. &lt;/h2&gt;

&lt;p&gt; The first problem was that libgnomedb-1:3.99.7-3.fc11.x86_64 had to be removed, as it didn't exist anymore in F-13 and was depending on things that needed to be upgraded. Then I had a similar problem with a bunch of fc8 packages that were also depending on things that were going away:

&lt;pre&gt;
    Erase        chkfontpath-1.10.1-2.fc8.x86_64
    Erase        fonts-arabic-2.1-2.fc8.noarch
    Erase        fonts-bengali-2.1.5-3.fc8.noarch
    Erase        fonts-gujarati-2.1.5-3.fc8.noarch
    Erase        fonts-hebrew-0.101-2.fc8.noarch
    Erase        fonts-hindi-2.1.5-3.fc8.noarch
    Erase        fonts-kannada-2.1.5-3.fc8.noarch
    Erase        fonts-malayalam-2.1.5-3.fc8.noarch
    Erase        fonts-oriya-2.1.5-3.fc8.noarch
    Erase        fonts-punjabi-2.1.5-3.fc8.noarch
    Erase        fonts-sinhala-0.2.2-3.fc8.noarch
    Erase        fonts-tamil-2.1.5-3.fc8.noarch
    Erase        fonts-telugu-2.1.5-3.fc8.noarch
    Erase        system-config-soundcard-2.0.6-11.fc8.noarch
&lt;/pre&gt;

&lt;p&gt;
Finally I had to erase cryptsetup-luks-1.0.6-7.fc11.i586, because that had stopped being multilib and didn't have obsoletes. All of these were relatively painless, esp. with the new code in yum which tries to explain depsolving problems.
&lt;/p&gt;

&lt;h2&gt; distro-sync &lt;/h2&gt;

&lt;p&gt; Then I ran distro-sync (excluding yum), and let it do its thing. This is a new command in yum, and means you don't have to worry about bugs in NEVR packaging (because it will downgrade packages to the latest available). These are the packages it "fixed" for me:
&lt;/p&gt;

&lt;pre&gt;
    Downgrade    compat-db45-4.5.20-2.fc13.x86_64
    Downgraded               4.5.20-5.fc10.x86_64
    Downgrade    nss-softokn-freebl-3.12.4-19.fc13.x86_64
    Downgraded                      3.12.6-1.2.fc11.x86_64
    Downgrade    preupgrade-1.1.4-1.fc13.noarch
    Downgraded              1.1.5-1.fc11.noarch
&lt;/pre&gt;

&lt;p&gt; The compat-db45 is esp. amusing, as that had probably been broken all through fc11. &lt;/p&gt;</content>
  </entry>
  <entry>
    <id>urn:lj:livejournal.com:atom1:illiterat:7834</id>
    <link rel="alternate" type="text/html" href="http://illiterat.livejournal.com/7834.html"/>
    <link rel="self" type="text/xml" href="http://illiterat.livejournal.com/data/atom/?itemid=7834"/>
    <title>Explaining: "Warning: RPMDB altered outside of yum."</title>
    <published>2010-01-22T22:04:57Z</published>
    <updated>2010-01-22T22:09:07Z</updated>
    <category term="centos"/>
    <category term="package management"/>
    <category term="yum"/>
    <category term="rhel"/>
    <category term="fedora"/>
    <content type="html">&lt;h1&gt;What does it mean?&lt;/h1&gt;

&lt;p&gt; The yum message "Warning: RPMDB altered outside of yum." or, as the yum message said for a few months, "Warning: RPMDB has been altered since the last yum transaction." means some application has altered the rpm database (installed or removed a package) without going through the Yum APIs. This is almost always due to someone using rpm directly (e.g. rpm -ivh blah.rpm), but another possibility is an application built on top of the rpm APIs (e.g. smart, apt, zypp). While it's &lt;b&gt;possible&lt;/b&gt; that someone has hacked your machine and altered the rpmdb maliciously, it would have to be done poorly to trigger this warning.&lt;/p&gt;

&lt;a name="cutid1"&gt;&lt;/a&gt;

&lt;h1&gt;Why has yum started to emit this warning?&lt;/h1&gt;

&lt;p&gt;There are three main sets of reasoning behind bringing this to the user's attention.&lt;/p&gt;

&lt;h2&gt;New yum features require yum being "the" packaging API&lt;/h2&gt;

&lt;p&gt; There are now a few features in yum, requested by users of the package management system, that require yum to be aware of all package actions on the system. Here are a few of the current ones:
&lt;/p&gt;
&lt;ol&gt;

&lt;li&gt; The most obvious example is "yum history", which records which packages were installed, when and by whom. If yum is not involved in installing/updating/removing/etc. some packages then a lot of the benefits of "yum history" are gone. For instance there is no useful audit trail anymore: you can't use "yum history list blah" and know you have all the instances where something happened to "blah".&lt;/li&gt;

&lt;li&gt; Yum now has its own database, for package information it wants to record that has no corresponding entry in the rpmdb. The obvious example is "the id of the repository that this package was installed from", but there are quite a few pieces of info now.&lt;/li&gt;

&lt;li&gt; Following on from the previous point, rpmdb versions are a significant feature for managing many machines by yum. They require information from the yumdb, so installing something via yum on one machine but via rpm on another would give the machines different "rpmdb versions".&lt;/li&gt;

&lt;/ol&gt;

&lt;p&gt;This is not a complete list, and as more package management features are implemented they are much more likely to be implemented at the yum layer than at the rpm layer. Not because rpm is bad, but for the same reasons that the above features were implemented in yum, it's much easier and faster to implement them there.&lt;/p&gt;

&lt;h2&gt;Rpm is often abused, when used directly&lt;/h2&gt;

&lt;p&gt; We, the yum developers, often find that people using rpm directly only do so to solve problems in a way that just creates more problems (and often makes it look like yum is at fault). For instance, using any of --force, --nodeps or --justdb is pretty much outright lying to yum, and is all but guaranteed to confuse yum vs. rpm. Over the last 3 months, over 10% of the bugs we've had against yum have been due to this kind of problem (and probably very close to 100% of bugs about "rpm doesn't like the depsolving solution yum did").&lt;/p&gt;

&lt;p&gt; There is also the problem of users accidentally using "rpm --install", which is very different from what yum thinks of as an "install" operation. This, again, just creates problems, and even if yum works around them it is unlikely to do so in the way the user wished. &lt;/p&gt;

&lt;p&gt; In addition to the problems it creates, there are no known reasons to use rpm instead of yum. Over the last year or so yum has added features like the downgrade and reinstall commands, which obsolete using rpm --oldpackage and are generally much easier for the user (in this case esp. combined with yum history undo/redo). Recent versions of yum will even allow you to do things like "yum install http://example.com/blah.rpm", which was the last use case where you could easily do something with rpm and not with yum (and even then we also changed yum so we could add the --releasever option, which is a much easier solution to the common reason why people want http installs). But if you can think of something that is easier to do with rpm, we'll almost certainly add a feature to yum to solve that problem. &lt;/p&gt;

&lt;h2&gt;We do extra work, and so inform the user&lt;/h2&gt;

&lt;p&gt; We now do a non-trivial amount of work when we detect that something has happened to the rpmdb, including a full "yum check" run, so that yum can show problems as close to their source as possible. Just stopping yum for a significant amount of time, or outputting problem reports, with no context would not be a good way to interact with the user. We are also able to cache rpm data, and thus have yum run faster, when yum is aware of all rpmdb operations (so again, we'd want some way to let the user know: yum is running slower, and this is why).&lt;/p&gt;

&lt;h2&gt;What can I do to make it stop?&lt;/h2&gt;

&lt;p&gt; The most obvious solution is the one we recommend: Only use yum, or tools that use yum APIs, to install/update/remove/etc. packages on your system. If you don't like yum, and prefer to use some other packaging system, that's fine, but we'd assume you'd want to use that system all the time (and thus it doesn't matter what yum does or doesn't do, as you won't be using it).
&lt;/p&gt;

&lt;p&gt; If you think you really need to use yum some/most of the time but something else the rest of the time, then there is a way to currently turn all this checking off. In your yum.conf set "history_record = false", and yum will not record any history (the "yum history" command will be useless) but yum will also not be able to tell when the rpmdb has changed. This will break other features, like the yumdb, but it will stop the warnings.
&lt;/p&gt;
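
&lt;p&gt; As a minimal sketch of that change, assuming the stock /etc/yum.conf location and that the option goes in the [main] section alongside the other global settings:
&lt;/p&gt;

&lt;pre&gt;
# /etc/yum.conf (assumed default location)
[main]
# Stop recording "yum history"; yum can then no longer tell when the rpmdb changed.
history_record = false
&lt;/pre&gt;

&lt;p&gt; Removing the line again (or setting it back to true) should bring back both the history and the warning.
&lt;/p&gt;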

&lt;h1&gt;Known problems&lt;/h1&gt;

&lt;p&gt; Old versions of the "remove-with-leaves" yum plugin called an API which would corrupt the state yum thought the rpmdb was in. Older versions of the "keys" plugin may also have been affected. This meant that the warning was produced all the time, and yum history would not be happy. Newer versions of yum fixed the APIs affected. &lt;/p&gt;

&lt;p&gt; The &lt;a href="http://fedorasolved.org/Members/zcat/akmods" rel="nofollow"&gt;akmod&lt;/a&gt; packages seem to work by calling rpm directly, and thus operate without yum's knowledge; they need to be fixed.
&lt;/p&gt;
&lt;a name='cutid1-end'&gt;&lt;/a&gt;</content>
  </entry>
  <entry>
    <id>urn:lj:livejournal.com:atom1:illiterat:7660</id>
    <link rel="alternate" type="text/html" href="http://illiterat.livejournal.com/7660.html"/>
    <link rel="self" type="text/xml" href="http://illiterat.livejournal.com/data/atom/?itemid=7660"/>
    <title>Multiple problems with multiple versions</title>
    <published>2009-10-09T13:42:21Z</published>
    <updated>2009-10-16T21:23:10Z</updated>
    <category term="python"/>
    <category term="yum"/>
    <category term="fedora"/>
    <content type="html">&lt;p&gt; I've now seen a few requests for yum which can be generalized as "work better with multiple versions of packages". So I decided to write this, instead of replying N times to all the different requests.
&lt;/p&gt;

&lt;a name="cutid1"&gt;&lt;/a&gt;
&lt;h2&gt;The simple feature: just do what rpm does&lt;/h2&gt;

&lt;p&gt; Since its inception rpm has differentiated between an "install" and an "update", allowing you to install X-1-1 and X-2-1 as long as neither package provided the same filename as the other with different content. At the rpm layer this is mostly an easy problem to solve: if the user wants to install instead of update, you just check all the files in both packages and, as long as there isn't a conflict, you install the new package but don't remove the old one.
&lt;/p&gt;

&lt;p&gt; As I said, this mostly works well: you can update some packages and install/remove the others. However, one problem you'll quickly run into is when you have both X-1-1 and X-2-1 installed, and these now need to be updated by X-1-2 and X-2-2. There's no easy way to do this, as there's no good way to tell rpm "update just the X-1-* (or X-2-*) packages". This is a core problem for yum: if you have multiple versions of a package installed, what does "yum update" do?
&lt;/p&gt;


&lt;h2&gt;Kernel: The special case&lt;/h2&gt;

&lt;p&gt; The kernel is treated differently from every other package, mainly because you can't move to the new version without a reboot, the running version needs to be able to access its modules, and if you only have one version and it breaks then the machine doesn't boot anymore. So yum has infrastructure set up to install multiple versions of packages, which is used/tested just by the kernel (although you can give that ability to any package).
&lt;/p&gt;

&lt;p&gt; However, the kernel has a few features that are in its favour. The two main ones are that each kernel package has files that are 100% unique to it, and that no other package "uses" the data within the kernel package (so nothing needs to answer the question "which of the N versions available should I use").
&lt;/p&gt;

&lt;p&gt; This works fine for the kernel, and has an obvious answer to the question "what does yum update do" ... if there is a newer version of the package than the one installed, we install it, and if the installonly_limit has been reached for this package &lt;b&gt;we remove the oldest one&lt;/b&gt; (that isn't the running kernel, and hasn't been manually marked to be kept in the yumdb).
&lt;/p&gt;
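
&lt;p&gt; For reference, a rough sketch of the yum.conf settings behind this policy. The values are illustrative assumptions, not necessarily the shipped defaults, and the package list is deliberately cut down to just the kernel:
&lt;/p&gt;

&lt;pre&gt;
[main]
# Packages that get installed alongside older versions instead of updating them.
installonlypkgs = kernel
# Keep at most this many versions; beyond that the oldest one
# (that isn't the running kernel) is removed.
installonly_limit = 3
&lt;/pre&gt;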

&lt;p&gt; This simple policy wouldn't work for any other packages though (like the example above), so we either need to change this policy (and make sure the kernel package keeps working) or try something else...
&lt;/p&gt;


&lt;h2&gt;Embed the version in the package name&lt;/h2&gt;

&lt;p&gt; This is the next obvious solution, and is used quite frequently. You still need to make sure neither package has files which conflict with the other, and there are usability problems to do with the fact the package X is now called X1 and/or X2. But this, at least, &lt;b&gt;can&lt;/b&gt; work and allows you to have X1-1-1 be upgraded by X1-1-2 and X2-2-1 be upgraded by X2-2-2. &lt;/p&gt;

&lt;p&gt; The problems with doing this usually stem from the fact the packages are not isolated, and they will often provide some of the same things (so even though their names are different, they can be referred to by a common provide name). This is esp. a problem with rpm autoprovides. It then becomes problematic to test, because instead of one test case (install package X) to see if it works, you now have three test cases (install X1, install X2, and install both X1 and X2). This testing complexity rises as you add more versioned packages: adding X3 makes it 7 test cases, and adding X4 makes it 15! &lt;/p&gt;
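
&lt;p&gt; To put numbers on that growth, here is a small shell sketch. It assumes every non-empty combination of the N versioned packages counts as one install test case (which is where the 3, 7 and 15 come from); the function name "combos" is made up purely for illustration. &lt;/p&gt;

&lt;pre&gt;
# Number of non-empty install combinations of N parallel-installable
# packages, i.e. 2^N - 1 (assumption: each combination is one test case).
combos() {
    n=$1
    total=1
    # Double once per package to get 2^N ...
    while [ "$n" -gt 0 ]; do
        total=$(( total * 2 ))
        n=$(( n - 1 ))
    done
    # ... then drop the empty "install nothing" combination.
    echo $(( total - 1 ))
}

for n in 1 2 3 4; do
    echo "X1..X$n: $(combos "$n") install combinations"
done
&lt;/pre&gt;

&lt;p&gt; Every extra versioned package roughly doubles the number of combinations, which is why this gets out of hand so quickly. &lt;/p&gt;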

&lt;p&gt; The good thing about this method (from a yum developer's point of view) is that nothing needs to be done on the package manager side. Each package is completely independent, so if normal packages install then versioned packages install. &lt;/p&gt;

&lt;h3&gt;Taking a look at an example, python2 and python3 packages&lt;/h3&gt;

&lt;p&gt; For a concrete example consider having python2 and python3 (as of 2009 that is desirable because of the backwards incompatibility snafu with python3), we can "easily" create a python3 package that is installable alongside the current default of python (== python-2.6/python-2.7/whatever). There are some auto-provides, but we could either remove/change them on the python3 package or make sure nothing else is using them. Users can install both and run either /usr/bin/python or /usr/bin/python3, all seems good.
&lt;/p&gt;

&lt;p&gt; If this were enough, you could now rest ... until the first time some random package decides it'd like to use python3 instead of python, and you get a huge number of complaints that users now have two versions of python wasting disk/bandwidth/etc.
&lt;/p&gt;

&lt;p&gt; However core python is very rarely enough, for any GUI application you'll need python3 bindings to gtk and for anything a bit more complex (like yum) you'll need bindings to lots of C APIs (rpm, xml, gpg and curl to name a few of yum's requirements). This means you've gone to a lot of work for something that isn't very useful, or you have to create versioned packages for a significant part of the packages that touch python ... and then you have to test that (which, remember, is 3 times the work).&lt;/p&gt;

&lt;p&gt; And you have to maintain that, testing updates and doing security errata (3x the work), and deal with bugs/complaints from people saying that package FOO is only available in version A and they need it in the other version ... or logging bugs against python-blah instead of python-blah3. Of course humans are lazy, so to get around doing some of this work someone will decide it's a good idea to have a single python-blah package that works with either/both versions of python ... and then you'll have more problems.
&lt;/p&gt;

&lt;p&gt;
And then at the end you still have the usability problems, where the user installs "python-blah" but really wanted to install "python3-blah", and doesn't understand why things don't work.
&lt;/p&gt;

&lt;a name='cutid1-end'&gt;&lt;/a&gt;</content>
  </entry>
  <entry>
    <id>urn:lj:livejournal.com:atom1:illiterat:7412</id>
    <link rel="alternate" type="text/html" href="http://illiterat.livejournal.com/7412.html"/>
    <link rel="self" type="text/xml" href="http://illiterat.livejournal.com/data/atom/?itemid=7412"/>
    <title>Lies, damn lies, and benchmarks</title>
    <published>2009-07-31T19:11:15Z</published>
    <updated>2010-01-22T22:48:52Z</updated>
    <category term="benchmarking"/>
    <category term="yum"/>
    <content type="html">&lt;h1&gt;Or why I hate "quick benchmarks"...&lt;/h1&gt;

&lt;p&gt;Recently I've started to see a lot more of what I'd call "quick benchmarks". Often they're to do with &lt;a href="http://yum.baseurl.org/" rel="nofollow"&gt;yum&lt;/a&gt;, but mainly I think that's because those tend to get sent to me by multiple methods. So I thought I'd try and write down why I react so negatively/dismissively to them, how people can spot the underlying problems that annoy me and, even better, some advice on how you can go about doing some real benchmarks if that kind of thing interests you (but it's &lt;b&gt;much&lt;/b&gt; more work than quick benchmarks). &lt;/p&gt;

&lt;p&gt;The summary of the problem is that quick software benchmarking often involves taking a huge number of differences between two applications and reducing them to a single number. Then you compare just the numbers, and come to a conclusion. So X gets 3 and Y gets 5 for problem ABCD ... therefore Y is 66% worse than X at ABCD. Except that might be a highly misleading (or worse) conclusion, for a number of reasons:&lt;/p&gt;

&lt;a name="cutid1"&gt;&lt;/a&gt;
&lt;h2&gt;Benchmarking Bias&lt;/h2&gt;

&lt;p&gt; This is often the root cause of all problems in quick benchmarks, and because of its pervasiveness in human nature I would say that a significant portion of the work in doing a good benchmark is making sure you've tried to reduce the effect of your own bias. The bias happens in many forms, from the simple fact that when testing X and Y you might have a deeper understanding of one and so test its strengths; make fewer configuration errors; or even assume favourable results are correct but unfavourable ones are incorrect.
&lt;/p&gt;

&lt;h3&gt;An old example&lt;/h3&gt;

&lt;p&gt; Probably the first experience I had of this was at the end of the 1990s. I can't find a link to the original "discussions" now, but I can relate the main points. Linux had supported SMP (running on 2 or more CPUs at once) for a couple of years and FreeBSD was just starting to look at it seriously. While doing the Linux work the kernel developers had created a simple benchmark of rebuilding the kernel with different "make -j" configurations (controlling how parallel make is), probably because it's simple, tests a bunch of things at once and affects kernel developers personally. Obviously it was natural for the FreeBSD developers to do the same kind of thing, when they were doing the same kinds of work. I found what I think is the original FreeBSD report that someone posted on this: &lt;a href="http://people.freebsd.org/~fsmp/SMP/akgraph-a/graph1.html" rel="nofollow"&gt;FreeBSD SMP - kernel compilation bench&lt;/a&gt;.
&lt;/p&gt;

&lt;p&gt; Note that both of the above benchmarks were meant to be tested against themselves (i.e. the developer would run the benchmark, make some changes and then run the benchmark again on the new kernel with the changes and see the difference). And, again, they were great for that ... pretty much all developers have similar kinds of test runs that they run. However someone then decided to take "the result for Linux" and compare it to "the result for FreeBSD", and came to the conclusion that because the FreeBSD number was smaller than the Linux number, "FreeBSD was already faster at SMP than Linux". This conclusion was repeated by a number of knowledgeable FreeBSD developers before a Linux developer pointed out the obvious fact that the tests didn't just have different kernels, but different source trees were being compiled with different compilation toolchains.
&lt;/p&gt;

&lt;p&gt; Again, I don't think the FreeBSD developers intentionally meant to create or use bogus data, just that given some random numbers implying that FreeBSD was better than Linux they were biased to believe them and so didn't think about it too much.
&lt;/p&gt;

&lt;h3&gt;A couple of yum examples&lt;/h3&gt;

&lt;p&gt; I've seen a lot of people test "yum makecache" vs. "apt-get update" or "smart update" or "zypper refresh". This makes some sense from the point of view of an apt developer/user because this is an operation that the user has to run before any set of operations to make sure the database is up to date, so any improvement (relative to older versions of yourself) is a significant win. However a yum user is unlikely to ever run this command because the default mode of operation is for yum to manage database synchronization.
&lt;/p&gt;

&lt;p&gt; Another common problem is to compare something like "apt-cache search" / "apt-file search" against "yum search" / "yum provides". The problem here is that apt-cache / apt-file just do a simple grep on the cache of available packages ... yum does its searches against what yum can see and use (so, for example, versionlock'ing a package will affect the results you get in yum ... as will installing packages).
&lt;/p&gt;

&lt;p&gt;
 These problems should be obvious after even a moment's thought about how yum is used, but again most of the people publishing these kinds of results either don't use yum or are expecting it to be slower and so don't think about results which confirm that expectation.
&lt;/p&gt;

&lt;h2&gt;Measuring the slow thing&lt;/h2&gt;

&lt;p&gt; Many developers are aware that their applications have a "fast path" and a "slow path", which normally align with "expected, normal, cases work quickly" and "unexpected cases work, but are slow". Benchmarkers often do not know these distinctions. I've seen numerous cases where someone benchmarks something that wouldn't be done in real life. Alas, this is hard to spot for the normal user, and is one of the reasons that it's wise to speak with the developers of whatever you are benchmarking. &lt;/p&gt;

&lt;h2&gt;Benchmarking 1 point and then concluding about N&lt;/h2&gt;

&lt;h3&gt;The obvious yum example&lt;/h3&gt;

&lt;p&gt; By far the most common problem here is people running a simple benchmark like "time (echo n | yum update)", comparing it to the same operation in apt or zypp, and then concluding X is x% faster at updating than Y. The problem here is that updating actually has a number of operations (breaking them down roughly):&lt;/p&gt;

&lt;ol&gt;
 &lt;li&gt; read repo. configuration, cmd line options etc. &lt;/li&gt;
 &lt;li&gt; check if repo. metadata is current. &lt;/li&gt;
 &lt;li&gt; download repo. metadata if not current. &lt;/li&gt;
 &lt;li&gt; merge configured repo. metadata into "available". &lt;/li&gt;
 &lt;li&gt; read current rpm metadata. &lt;/li&gt;
 &lt;li&gt; depsolve updates+obsoletes+etc. from rpm and repo. metadata. &lt;/li&gt;
 &lt;li&gt; output changes. &lt;/li&gt;
 &lt;li&gt; confirm changes. &lt;/li&gt;
 &lt;li&gt; download updates. &lt;/li&gt;
 &lt;li&gt; install updates. &lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt; However even though there are 10 operations above, the "simple update benchmark" only really tests "6. depsolve ...". Which is fine on its own, it's worth knowing what the depsolve time is (and I run this benchmark myself with yum to find out that info.) but depsolve time is not the same as update time. In fact most of the time it's in the noise for the update operation as a whole (obviously saying your update takes 25% of the time is much more fun than saying it takes 98% of the time).&lt;/p&gt;
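
&lt;p&gt; To put rough numbers on that last remark, here is a minimal Python sketch. The phase names and every timing in it are invented for illustration (they are not measurements of yum, apt or anything else): the second tool's depsolve takes 25% of the time of the first, but the update as a whole still takes 98% of the time.&lt;/p&gt;

&lt;pre&gt;
```python
# Hypothetical timings in seconds; none of these are real measurements.
tool_a = {"metadata": 20.0, "depsolve": 4.0, "download": 90.0, "install": 36.0}
tool_b = {"metadata": 20.0, "depsolve": 1.0, "download": 90.0, "install": 36.0}


def ratio(new, old):
    """Return new as a percentage of old."""
    return 100.0 * new / old


# Compare only the depsolve phase, then the update as a whole.
depsolve_pct = ratio(tool_b["depsolve"], tool_a["depsolve"])
total_pct = ratio(sum(tool_b.values()), sum(tool_a.values()))

print("depsolve: B takes %.0f%% of A's time" % depsolve_pct)
print("whole update: B takes %.0f%% of A's time" % total_pct)
```
&lt;/pre&gt;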

&lt;p&gt; I've even seen problems here where they'll test apt on Debian vs. yum on Fedora, which much like the FreeBSD vs. Linux test changes so many different things that you can't conclude anything about just the package managers. There have also been cases where someone tested "apt upgrade" vs. "yum upgrade" but the yum operation also downloaded repo. metadata, as it wasn't current/available.
&lt;/p&gt;

&lt;h2&gt;Benchmarking N points and then concluding about 1&lt;/h2&gt;

&lt;h3&gt;The FreeBSD vs. Linux point&lt;/h3&gt;

&lt;p&gt;As I said above, the technical problem with the FreeBSD make kernel vs. Linux make kernel comparison was that it was testing 3 different sets of things (buildtools, source code, kernel) and drawing a conclusion about only 1 of them.&lt;/p&gt;

&lt;h3&gt;Phoronix has many examples here&lt;/h3&gt;

&lt;p&gt; Phoronix has many examples of all the different problems you can get into here. They range from things like their "apache benchmark", which turned out to have nothing to do with Apache on either platform they tested it on, and mp3 encoding "filesystem benchmarks", to their usual "conclusions" which take 10-20 completely separate points, with completely separate error rates, and come up with an "average".
&lt;/p&gt;

&lt;h3&gt;The obvious yum example&lt;/h3&gt;

&lt;p&gt; The usual way this happens in package management "benchmarking" is that a random number of commands will be tested, like install; update; remove; whatprovides; search; list ... and then a final score will be given. This completely ignores the fact some of those commands are being run much more often than others, and in different situations. &lt;/p&gt;
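
&lt;p&gt; A small Python sketch of why the weighting matters. The command timings and the runs-per-month figures are all made up, and X and Y are just placeholders: with a flat average X "wins", but once each command is weighted by how often it is run Y comes out ahead.&lt;/p&gt;

&lt;pre&gt;
```python
# Hypothetical per-command times (seconds) and how often a user runs each
# command per month; all numbers are invented for illustration.
times_x = {"install": 10.0, "update": 30.0, "search": 1.0, "whatprovides": 2.0}
times_y = {"install": 12.0, "update": 20.0, "search": 4.0, "whatprovides": 9.0}
runs_per_month = {"install": 4, "update": 30, "search": 2, "whatprovides": 1}


def flat_score(times):
    """Every command counts the same, as in a typical 'final score'."""
    return sum(times.values()) / len(times)


def weighted_score(times, weights):
    """Weight each command by how often it is actually run."""
    total_runs = sum(weights.values())
    return sum(times[cmd] * weights[cmd] for cmd in times) / total_runs


# Lower is better in both cases.
print("flat:     X=%.2f Y=%.2f" % (flat_score(times_x), flat_score(times_y)))
print("weighted: X=%.2f Y=%.2f" % (weighted_score(times_x, runs_per_month),
                                   weighted_score(times_y, runs_per_month)))
```
&lt;/pre&gt;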

&lt;h2&gt;Standard deviation is always significant&lt;/h2&gt;

&lt;p&gt; The simple way to think of this is that multiple runs of anything you test need to be performed, and if you never have results where the answer is "X and Y perform the same" (even though the absolute numbers are never going to be identical) ... the benchmark has probably screwed up the standard deviation.&lt;/p&gt;

&lt;h3&gt;Phoronix has many examples here&lt;/h3&gt;

&lt;p&gt; This is often combined with the previous problem, if you test X vs. Y and get the results of 1000 and 1200 then there's a huge difference between a SD (&lt;a href="http://en.wikipedia.org/wiki/Standard_deviation" rel="nofollow"&gt;standard deviation&lt;/a&gt;) of 500 and an SD of 1 on those results. And this is esp. important when you have another result which is 10 vs. 20 and an SD that could be 0.1 or 5.
&lt;/p&gt;
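
&lt;p&gt; As a rough sketch of that difference, using Python's statistics module. The run times are invented, and dividing the gap between the means by the larger standard deviation is only a crude sanity check, not a proper significance test: both pairs of runs average 1000 vs. 1200, but in the noisy pair (an SD of roughly 450) the gap is well under one SD, while in the steady pair (an SD of 1) it is 200 SDs.&lt;/p&gt;

&lt;pre&gt;
```python
import statistics

# Invented run times: same means (1000 vs. 1200), very different spread.
noisy_x = [500, 1500, 1000, 600, 1400]
noisy_y = [700, 1700, 1200, 800, 1600]
steady_x = [999, 1001, 1000, 999, 1001]
steady_y = [1199, 1201, 1200, 1199, 1201]


def gap_in_sds(runs_x, runs_y):
    """Difference of the means, in units of the larger sample SD."""
    gap = abs(statistics.mean(runs_y) - statistics.mean(runs_x))
    return gap / max(statistics.stdev(runs_x), statistics.stdev(runs_y))


print("noisy:  %.2f SDs apart" % gap_in_sds(noisy_x, noisy_y))
print("steady: %.2f SDs apart" % gap_in_sds(steady_x, steady_y))
```
&lt;/pre&gt;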

&lt;h3&gt;The obligatory yum point&lt;/h3&gt;

&lt;p&gt; I never see benchmarks of package management operations that include a standard deviation value, and this can be esp. important when different values and/or different configurations can affect the performance significantly.
&lt;/p&gt;

&lt;h1&gt;What to do, if you want to create a real benchmark&lt;/h1&gt;

&lt;p&gt;All of the above is not to try to discourage real benchmarking, as this is a time consuming and unrewarding task (IMO), but often bad benchmarking is worse than none at all. If you aren't up for that then feel free to suggest usecases that seem difficult/slow/etc., even if you need to explain it as "currently I run X like this, and it takes N time, how do I do something similar with Y in a similar amount of time"  ... this is not the same thing, because the answer might well be "run Y like this instead". But if you are up for the challenge then...&lt;/p&gt;

&lt;p&gt; As I said earlier, if you are benchmarking X and Y and comparing them then you need to be very familiar with both X and Y, which means investing a significant amount of time working with both X and Y. This very likely means that you want to speak to developers of both X and Y, both about the measurements you are trying to take and the results you are getting. I am always very suspicious if the benchmarking seems to be one sided (in that more knowledge was available about X than Y, at the time of testing), or more to the point I'm very suspicious if you are benchmarking yum but I've no idea who you are.
&lt;/p&gt;

&lt;p&gt; Also, even if you create a good benchmark which shows X is 999x faster than Y for operation FOO, and FOO is a useful feature that people will want to take advantage of, it is likely that the developers for Y will be able to change Y (sometimes trivially, as it might be an assumed edge case or regression no one had hit, yet) in a way which reduces the difference significantly. This also often means that doing benchmarks properly "just" makes both X and Y better, with the published results showing they are within 5% or whatever and so "nobody cares" about the benchmarks. Which makes running the benchmarks pretty unrewarding, but hey you want to be the benchmarker not me... :)
&lt;/p&gt;
&lt;a name='cutid1-end'&gt;&lt;/a&gt;</content>
  </entry>
  <entry>
    <id>urn:lj:livejournal.com:atom1:illiterat:7125</id>
    <link rel="alternate" type="text/html" href="http://illiterat.livejournal.com/7125.html"/>
    <link rel="self" type="text/xml" href="http://illiterat.livejournal.com/data/atom/?itemid=7125"/>
    <title>BugZilla feature of the year -- recent (commented) history</title>
    <published>2009-04-29T22:38:09Z</published>
    <updated>2010-01-22T22:49:15Z</updated>
    <category term="bugzilla"/>
    <category term="tips"/>
    <category term="fedora"/>
    <content type="html">&lt;p&gt; Something I've wanted for a long time is a way to "show me the &lt;a href="http://bugzilla.redhat.com" rel="nofollow"&gt;BugZilla&lt;/a&gt; tickets which I've looked at recently" (last day, week, etc.). Like almost everyone else I speak to, I get by saving all my bugzilla email and then doing searches in evolution.&lt;/p&gt;

&lt;p&gt; However today &lt;lj-user&gt;ajaxxx&lt;/lj-user&gt; found/used advanced bugzilla search options to give a page which has the most recent history of comments you've made. This isn't quite the same as being able to see everything that you've looked at, but it's pretty close.
&lt;/p&gt;

&lt;h1&gt; The url and the saved search &lt;/h1&gt;

&lt;p&gt; The search works due to two "advanced" features: 1. Selecting yourself as the commenter: &amp;lt;&lt;b&gt;field0-1-0=commenter&lt;/b&gt;&amp;gt;, &amp;lt;&lt;b&gt;type0-1-0=equals&lt;/b&gt;&amp;gt;, and &amp;lt;&lt;b&gt;value0-1-0=%user%&lt;/b&gt;&amp;gt; and 2. Selecting comment changes that happened in the last 8 days: &amp;lt;&lt;b&gt;field0-0-0=longdesc&lt;/b&gt;&amp;gt;, &amp;lt;&lt;b&gt;type0-0-0=changedafter&lt;/b&gt;&amp;gt;, and &amp;lt;&lt;b&gt;value0-0-0=8d&lt;/b&gt;&amp;gt;. The "8d" part can obviously be changed to reflect different history. To experience the feature, a full working URL is: &lt;a href="https://bugzilla.redhat.com/buglist.cgi?query_format=advanced&amp;amp;field0-0-0=longdesc&amp;amp;type0-0-0=changedafter&amp;amp;value0-0-0=8d&amp;amp;field0-1-0=commenter&amp;amp;type0-1-0=equals&amp;amp;value0-1-0=%25user%25" rel="nofollow"&gt;here&lt;/a&gt;
&lt;/p&gt; 
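
&lt;p&gt; If you'd rather build that URL in a script, so the number of days is easy to change, a minimal Python sketch could look like this. The parameter names and values are copied from the URL above; the function name is made up for the example, and nothing here checks that your Bugzilla version still accepts these parameters.&lt;/p&gt;

&lt;pre&gt;
```python
from urllib.parse import urlencode

BASE = "https://bugzilla.redhat.com/buglist.cgi"


def recent_comments_url(days=8, user="%user%"):
    """Build the advanced-search URL for bugs 'user' commented on recently."""
    params = [
        ("query_format", "advanced"),
        # Comments (longdesc) that changed within the last N days.
        ("field0-0-0", "longdesc"),
        ("type0-0-0", "changedafter"),
        ("value0-0-0", "%dd" % days),
        # Where the commenter is the given user (%user% means "me").
        ("field0-1-0", "commenter"),
        ("type0-1-0", "equals"),
        ("value0-1-0", user),
    ]
    # urlencode() percent-escapes the values, so %user% becomes %25user%25.
    return BASE + "?" + urlencode(params)


print(recent_comments_url())
print(recent_comments_url(days=30))
```
&lt;/pre&gt;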

&lt;p&gt; I also created a saved search called &lt;a href="https://bugzilla.redhat.com/buglist.cgi?cmdtype=dorem&amp;amp;remaction=run&amp;amp;namedcmd=Commented%20in%20last%208%20days&amp;amp;sharer_id=73713" rel="nofollow"&gt;Commented in last 8 days&lt;/a&gt;. You may also want to change your column layout to include the changed time (alas, I don't know of any way to create a custom view for just a single search, *sigh*). &lt;/p&gt;

&lt;h2&gt;Bonus: search for bad attachment types&lt;/h2&gt;

&lt;p&gt; Another problem I hit a lot is that python code tracebacks are plain text, but users often leave the "crash" attachment as the default application/octet-stream ... which means it can't be viewed. Really I'd like BugZilla to notice this, and fix it. But after seeing the above search I created: &lt;a href="https://bugzilla.redhat.com/buglist.cgi?cmdtype=dorem&amp;amp;remaction=run&amp;amp;namedcmd=YUM%2A%20weird%20attachments&amp;amp;sharer_id=73713" rel="nofollow"&gt;YUM* weird attachments&lt;/a&gt;, which searches for any BZ (belonging to the yum group) that has an attachment of type application/octet-stream that isn't obsoleted. Then it's a simple matter of manually fixing the problem bugs.
&lt;/p&gt;

&lt;h3&gt;Update: 2009-04-30 --&lt;/h3&gt; Thanks to &lt;lj-user&gt;mcepl&lt;/lj-user&gt;, I've now fixed the "shared" links so they should work for everyone != me :).</content>
  </entry>
  <entry>
    <id>urn:lj:livejournal.com:atom1:illiterat:6716</id>
    <link rel="alternate" type="text/html" href="http://illiterat.livejournal.com/6716.html"/>
    <link rel="self" type="text/xml" href="http://illiterat.livejournal.com/data/atom/?itemid=6716"/>
    <title>Why trusted third party repos. will always be a bad idea</title>
    <published>2009-04-03T19:04:20Z</published>
    <updated>2010-01-22T22:49:48Z</updated>
    <category term="fail"/>
    <category term="package management"/>
    <category term="yum"/>
    <category term="pk"/>
    <category term="apt"/>
    <category term="fedora"/>
    <content type="html">&lt;h1&gt;Why not make third party repos. first class&lt;/h1&gt;
&lt;p&gt; Every now and again, someone takes a look at apt/yum/zypper/smart/PK/whatever and decides that although they have support for third party repos. it's "too annoying" for third parties to get users or for users to use them and so this is a problem which needs to be fixed. Another way this is presented is that the package managers should support "One Click Install". I will hopefully explain (once and for all) why this isn't a problem, and what third parties can do to get what they actually want (to make their users' lives easier). &lt;/p&gt;

&lt;a name="cutid1"&gt;&lt;/a&gt;
&lt;h2&gt; This is not a new problem &lt;/h2&gt;
&lt;p&gt; A long time ago now, I used Debian-2.2 on my desktop+server and it was good. But then the desktop seemed old and luckily for me Ximian came out with a GNOME desktop for Debian-2.2. So trusting that Ximian had tested everything, and worked to get the latest nice desktop bits into my stable Debian-2.2, I added their repo. and upgraded. I noted that they were replacing large parts of the desktop, and not just adding packages, but this was pretty much what I expected so it didn't bother me. And life was good, I now had a stable system and a new desktop.&lt;/p&gt;
&lt;p&gt; Then came the point where I wanted to "distro. upgrade" from Debian-2.2 to Debian-3.0, so I did what everyone does "apt-get update &amp;amp;&amp;amp; apt-get dist-upgrade" and I expected that the Ximian stuff was probably going to get lost, as this is what happened on all Red Hat CD updates I'd ever done. What actually happened though was apt-get tried to resolve dependencies for a while, and then said it couldn't do it and gave up. Life was not good. &lt;/p&gt;

&lt;h2&gt; The core problem is distributed database synchronization &lt;/h2&gt;

&lt;p&gt; The core problem is that "package management" is actually "database management", where moving from pkgA-1 to pkgA-2 is more about database synchronization than anything else. So when you add a "third party repo." you now have "distributed database management/synchronization". In simple terms, this means that you can test that Debian-X to Debian-Y works, or Fedora-X to Fedora-Y, but this testing will not apply to Debian-X1 or Fedora-X1, and while you could expand testing to cover those *-X1 cases (at great cost) no one knows how to make it work for all of -X1, -X2, -X3, ... -XN. &lt;/p&gt;

&lt;h2&gt; Ignoring the DB we still have the trust problem &lt;/h2&gt;

&lt;p&gt; Even if we could magically solve the distributed database problem there are significant problems with splitting your database beyond a certain point. At the inevitable endgame, let's say that all the "large" applications have their own repository (openoffice, evolution, gnome, firefox, kde, apache-httpd, postgresql, mysql, etc. etc.). Now whenever you want to update your view of this distributed database you have to contact N different repositories instead of the 2 you have for Fedora. Making this usable for 10 repos. is a significant amount of work, making it usable for 100 or even 1,000 repos. is a huge amount of work.
&lt;/p&gt;
&lt;p&gt; Also the quality control of the packages that can get onto your system is now distributed, because the quality of any package on the system is the minimum of that applied on all the repositories you have available. This is also true of the reliability/security, so instead of a single point of failure for most users you have N single points of failure.
&lt;/p&gt;


&lt;h2&gt; But what about "simple" packages and "semi-trusted third party repos" &lt;/h2&gt;

&lt;p&gt; One solution to the giant distributed database problem is to have "trusted third party repos." not participate in the database. They say something like "I need at least LSB-blah" but have no other dependencies. In theory this is workable, but to do this someone has to write a lot of code in all the package managers to handle these semi-trusted repos. And even after these code changes to sandbox the packages from the semi-trusted repos. you &lt;b&gt;still&lt;/b&gt; have a significant portion of the trust problems to do with managing a lot of repos. and dealing with the network etc. &lt;/p&gt;

&lt;p&gt; But IMO the most damning problem with this approach is the limited number of uses it could be put to, because the packages within these semi-trusted repos. will have far fewer features than first class packages used in trusted repos. This means that each application in these repos. would have to have its own copy of everything outside of LSB, so you might end up with 10 copies of FOO until it gets into the next LSB. And the semi-trusted repo. would have to deal with security updates for all of those things itself (and you have to trust it to keep doing so, in a timely manner).&lt;/p&gt;

&lt;h2&gt; What about if I just ignore all of the above &lt;/h2&gt;

&lt;p&gt; Even if you ignored or solved all of the above problems, a core point remains that you need a chain of trust starting from your main distribution. This could be Fedora installing the *-release file for the repo. you want, or some large amount of new code to do basically the same thing. The problem is that Fedora doesn't want to provide that chain of trust, for legal and other non-technical reasons, so you are back at the same place you've started from. &lt;/p&gt;

&lt;h2&gt; So why all the code to support multiple repos. &lt;/h2&gt;

&lt;p&gt; There are a number of cases where multiple repos. work well, and so they are a useful feature to have. However in these cases the extra repos. should rarely be classed as "third party". For instance Fedora has a "release" repo. and an "updates" repo., to cut down on metadata, and an "updates-testing" repo. so users can easily turn that on or off. RHEL also contains many extension repos. for specific applications like clustering, or updated MySQL ... but again any problems here can be solved by changes to the "main" repo. and these specified sets of repos. are tested together. &lt;/p&gt;

&lt;p&gt; Something that isn't obviously an extension repo. is the rpmfusion repos. for Fedora. But in reality they try to act as much like an extension repo., for US legally problematic packages, as possible. For instance while they are controlled outside of Fedora (and so, from the Fedora infrastructure POV are third party and untrusted) some of the people who control them are part of the Fedora community and so do the integration testing and can help fix any problems caused by using them with Fedora. They also have similar package review guidelines to Fedora, so the quality aspect is maintained as much as possible. However even with these constraints there are still some problems due to the database being distributed and not synchronized.&lt;/p&gt;

&lt;p&gt; It's also somewhat common for specific companies/groups to internally have repos., either for custom builds of distro. applications or addon applications they wrote/bought. However these act exactly like extension repos. in that someone is charged with doing all the integration testing and they have the trust problem solved due to it being controlled by them. These repos. can also easily be integrated into the installation, so they don't have the problems people are trying to work around with third party repos. &lt;/p&gt;

&lt;h2&gt; Possible solution for third parties &lt;/h2&gt;

&lt;p&gt; This does imply a possible solution for random third party repo. providers that want this problem solved: pool your resources and join the upstream community (to help with integration). An obvious choice is to join the work the rpmfusion community is doing (and also join the Fedora community). This way 100 or 1,000 third party package providers could all provide automatic updates etc. but with some implied level of QA/trust/etc. so that users could tell the good third party from the bad. This would significantly improve the user experience of getting packages from that third party, helping both sides. Another useful property of this is that nothing has to change in any of the package managers. &lt;/p&gt;

&lt;a name='cutid1-end'&gt;&lt;/a&gt;</content>
  </entry>
  <entry>
    <id>urn:lj:livejournal.com:atom1:illiterat:6439</id>
    <link rel="alternate" type="text/html" href="http://illiterat.livejournal.com/6439.html"/>
    <link rel="self" type="text/xml" href="http://illiterat.livejournal.com/data/atom/?itemid=6439"/>
    <title>A quick explanation of package sizes in yum and rpm</title>
    <published>2008-09-02T21:54:20Z</published>
    <updated>2008-09-02T22:09:59Z</updated>
    <category term="rpm"/>
    <category term="yum"/>
    <category term="reference"/>
    <content type="html">&lt;p&gt; It's pretty common to think that a specific thing always has a specific size, and people tend to think of an "rpm package" as a single object, thus it's common to ask what is "&lt;u&gt;the&lt;/u&gt; size of an rpm". However if you have a 1MB text file, and gzip compresses it to 50KB which you then upload to a HTTP server you now have at least 3 different sizes: text size; compressed size and upload size (includes HTTP headers etc.) and asking for &lt;u&gt;the&lt;/u&gt; size is ambiguous. So it is with rpm packages, and their many sizes.
&lt;/p&gt;
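
&lt;p&gt; The text file analogy is easy to try for yourself. A minimal Python sketch follows; the text content and the 300 byte allowance for HTTP headers are invented for the example (real header overhead varies), the point is just that one "file" ends up with three different sizes.&lt;/p&gt;

&lt;pre&gt;
```python
import gzip

# A made-up, repetitive "text file" so that gzip has something to compress.
text = ("the same line of text, over and over\n" * 1000).encode("ascii")
compressed = gzip.compress(text)

# Assumption: a fixed allowance for HTTP headers etc.; real overhead varies.
HTTP_OVERHEAD = 300

sizes = {
    "text size": len(text),
    "compressed size": len(compressed),
    "upload size": len(compressed) + HTTP_OVERHEAD,
}
for name, size in sizes.items():
    print("%-16s %d bytes" % (name, size))
```
&lt;/pre&gt;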

&lt;h3&gt; The three common sizes of an rpm package &lt;/h3&gt;

&lt;p&gt;
 I'm going to use the yum package object notation to explain the common different sizes, as those are the easiest to see/use (&lt;a href="http://illiterat.livejournal.com/6254.html" rel="nofollow"&gt;see my previous post on how to simply get package objects from yum&lt;/a&gt;):
&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt; An .rpm file package has pkg.archivesize (rpm TAG RPMTAG_ARCHIVESIZE), pkg.installedsize (rpm TAG RPMTAG_SIZE) and (for &lt;b&gt;yum available&lt;/b&gt; package objects) pkg.packagesize (a stat of the .rpm file).&lt;/p&gt;
&lt;/li&gt;

&lt;li&gt;
&lt;p&gt; Installed rpm packages have pkg.archivesize (rpm TAG RPMTAG_ARCHIVESIZE) and pkg.packagesize (rpm TAG RPMTAG_SIZE). pkg.installedsize doesn't exist, because an installed package can't be installed.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt; How are those values calculated? &lt;/h3&gt;

&lt;p&gt; rpm calculates the RPMTAG_SIZE value as the simple summation of the sizes of all the files within the rpm. rpm calculates the RPMTAG_ARCHIVESIZE value as the size of the cpio archive within the rpm, after decompression. These values are often very close. As I said above, the pkg.packagesize value for .rpm files is just the value returned by stat(2).
&lt;/p&gt;

&lt;h3&gt; So, what is pkg.size and why does it change? &lt;/h3&gt;

&lt;p&gt; Within yum there is a pkg.size, which maps to pkg.packagesize, which is used in all UI code that just wants to know "&lt;u&gt;the&lt;/u&gt; size" of a package. This value has the property that the value you get for "need to download X bytes to install" and "will free up X bytes on removal" is correct, which is what most people want to see most of the time. However using this value does mean that "&lt;u&gt;the&lt;/u&gt; size of the rpm package" changes after you install an rpm, so it can be confusing if you try to compare pkg.size values between installed and available packages (either via yum info, or looking at the values presented when installing a new kernel and removing an old one, say).
&lt;/p&gt;

&lt;h3&gt; So, what's the best way to compare &lt;u&gt;the&lt;/u&gt; size of an rpm package? &lt;/h3&gt;

&lt;p&gt; As you can see above, archivesize is the same for an available package and an installed one. So if you install the yum-list-data plugin, you can use "yum --showduplicates info-archive-sizes &amp;lt;pkgs&amp;gt;". Also you can always compare with just the available packages (and not installed ones), for instance "repoquery --show-duplicates --qf '%{nevra} %{archivesize} %{size}\n' &amp;lt;pkgs&amp;gt;"
&lt;/p&gt;</content>
  </entry>
  <entry>
    <id>urn:lj:livejournal.com:atom1:illiterat:6254</id>
    <link rel="alternate" type="text/html" href="http://illiterat.livejournal.com/6254.html"/>
    <link rel="self" type="text/xml" href="http://illiterat.livejournal.com/data/atom/?itemid=6254"/>
    <title>Programming with Yum in 5 minutes, or so</title>
    <published>2008-08-30T06:12:03Z</published>
    <updated>2008-08-30T07:01:35Z</updated>
    <category term="python"/>
    <category term="yum"/>
    <category term="example"/>
    <content type="html">&lt;p&gt; There are a lot of lines of code in yum, and it can be somewhat intimidating at first glance. However a significant amount of effort has been made to make simple things easy, and the hard things not so hard. The start of any code using yum will almost certainly have these four lines (and always the first and last one :).&lt;/p&gt;

&lt;pre style="color:#000000; background-color:#ffffff; font-size:10pt; font-family:&amp;#39;Courier New&amp;#39;;"&gt;&lt;span style="color:#838183; font-style:italic"&gt;#! /usr/bin/python -tt&lt;/span&gt;

&lt;span style="color:#000000; font-weight:bold"&gt;import&lt;/span&gt; os
&lt;span style="color:#000000; font-weight:bold"&gt;import&lt;/span&gt; sys
&lt;span style="color:#000000; font-weight:bold"&gt;import&lt;/span&gt; yum
&lt;/pre&gt;

&lt;p&gt; Those lines just tell python that you'd like to be able to use the yum code, and some stuff for the OS. Next comes the first bit of real code, and something which is also in almost every piece of code using yum: &lt;/p&gt;


&lt;pre style="color:#000000; background-color:#ffffff; font-size:10pt; font-family:&amp;#39;Courier New&amp;#39;;"&gt;yb &lt;span style="color:#000000"&gt;=&lt;/span&gt; yum&lt;span style="color:#000000"&gt;.&lt;/span&gt;&lt;span style="color:#010181"&gt;YumBase&lt;/span&gt;&lt;span style="color:#000000"&gt;()&lt;/span&gt;
&lt;/pre&gt;

&lt;p&gt; This creates a yum instance, that you can work with. Then one more piece, that is very useful:&lt;/p&gt;


&lt;pre style="color:#000000; background-color:#ffffff; font-size:10pt; font-family:&amp;#39;Courier New&amp;#39;;"&gt;yb&lt;span style="color:#000000"&gt;.&lt;/span&gt;conf&lt;span style="color:#000000"&gt;.&lt;/span&gt;cache &lt;span style="color:#000000"&gt;=&lt;/span&gt; os&lt;span style="color:#000000"&gt;.&lt;/span&gt;&lt;span style="color:#010181"&gt;geteuid&lt;/span&gt;&lt;span style="color:#000000"&gt;() !=&lt;/span&gt; &lt;span style="color:#2928ff"&gt;0&lt;/span&gt;
&lt;/pre&gt;

&lt;p&gt; This just tells the yum instance not to try to update any of its data when the script isn't run as root, as the caller of the script probably hasn't got the permissions to do so.&lt;/p&gt;
&lt;a name="cutid1"&gt;&lt;/a&gt;
&lt;p&gt; Now we have a real yum object that we can do things with, the three most useful parts to access are &lt;b&gt;pkgSack&lt;/b&gt;, &lt;b&gt;rpmdb&lt;/b&gt; and &lt;b&gt;repos&lt;/b&gt;. The first two basically act the same, but rpmdb performs queries based on the installed packages on the local machine and pkgSack performs them against all the enabled (normally remote) repositories. The repos attribute is almost always used for one of three things: calling repos.enableRepo(), repos.disableRepo() and less often repos.listEnabled(). The last is for when you need to set/override some specific configuration for the repos.
&lt;/p&gt;

&lt;p&gt; The &lt;b&gt;pkgSack&lt;/b&gt; and &lt;b&gt;rpmdb&lt;/b&gt; attributes have a fairly large number of functions you can call, most of which return "package objects"; these are the main things you work with in yum code. Probably the most useful functions to get those package objects are: searchNevra(), returnPackages() and searchPrimaryFields(). There are also some optimized variants like searchNames() and returnNewestByNameArch(). Some examples would be:
&lt;/p&gt;

&lt;h3&gt;Simple version of "yum list" command&lt;/h3&gt;

&lt;pre style="color:#000000; background-color:#ffffff; font-size:10pt; font-family:&amp;#39;Courier New&amp;#39;;"&gt;&lt;span style="color:#838183; font-style:italic"&gt;# Get the repository package objects matching the passed arguments&lt;/span&gt;
pkgs &lt;span style="color:#000000"&gt;=&lt;/span&gt; yb&lt;span style="color:#000000"&gt;.&lt;/span&gt;pkgSack&lt;span style="color:#000000"&gt;.&lt;/span&gt;&lt;span style="color:#010181"&gt;returnNewestByNameArch&lt;/span&gt;&lt;span style="color:#000000"&gt;(&lt;/span&gt;patterns&lt;span style="color:#000000"&gt;=&lt;/span&gt;sys&lt;span style="color:#000000"&gt;.&lt;/span&gt;argv&lt;span style="color:#000000"&gt;[&lt;/span&gt;&lt;span style="color:#2928ff"&gt;1&lt;/span&gt;&lt;span style="color:#000000"&gt;:])&lt;/span&gt;

&lt;span style="color:#000000; font-weight:bold"&gt;for&lt;/span&gt; pkg &lt;span style="color:#000000; font-weight:bold"&gt;in&lt;/span&gt; pkgs&lt;span style="color:#000000"&gt;:&lt;/span&gt;
    &lt;span style="color:#000000; font-weight:bold"&gt;print&lt;/span&gt; &lt;span style="color:#ff0000"&gt;&amp;quot;%s: %s&amp;quot;&lt;/span&gt; &lt;span style="color:#000000"&gt;% (&lt;/span&gt;pkg&lt;span style="color:#000000"&gt;,&lt;/span&gt; pkg&lt;span style="color:#000000"&gt;.&lt;/span&gt;summary&lt;span style="color:#000000"&gt;)&lt;/span&gt;
&lt;/pre&gt;

&lt;h3&gt;Simple stats. gathering from installed packages&lt;/h3&gt;

&lt;pre style="color:#000000; background-color:#ffffff; font-size:10pt; font-family:&amp;#39;Courier New&amp;#39;;"&gt;&lt;span style="color:#555555"&gt;   17 &lt;/span&gt;&lt;span style="color:#838183; font-style:italic"&gt;# Find the ten biggest installed packages&lt;/span&gt;
&lt;span style="color:#555555"&gt;   18 &lt;/span&gt;pkgs &lt;span style="color:#000000"&gt;=&lt;/span&gt; yb&lt;span style="color:#000000"&gt;.&lt;/span&gt;rpmdb&lt;span style="color:#000000"&gt;.&lt;/span&gt;&lt;span style="color:#010181"&gt;returnPackages&lt;/span&gt;&lt;span style="color:#000000"&gt;()&lt;/span&gt;
&lt;span style="color:#555555"&gt;   19 &lt;/span&gt;pkgs&lt;span style="color:#000000"&gt;.&lt;/span&gt;&lt;span style="color:#010181"&gt;sort&lt;/span&gt;&lt;span style="color:#000000"&gt;(&lt;/span&gt;key&lt;span style="color:#000000"&gt;=&lt;/span&gt;&lt;span style="color:#000000; font-weight:bold"&gt;lambda&lt;/span&gt; x&lt;span style="color:#000000"&gt;:&lt;/span&gt; x&lt;span style="color:#000000"&gt;.&lt;/span&gt;size&lt;span style="color:#000000"&gt;,&lt;/span&gt; reverse&lt;span style="color:#000000"&gt;=&lt;/span&gt;&lt;span style="color:#000000; font-weight:bold"&gt;True&lt;/span&gt;&lt;span style="color:#000000"&gt;)&lt;/span&gt;
&lt;span style="color:#555555"&gt;   20 &lt;/span&gt;&lt;span style="color:#000000; font-weight:bold"&gt;print&lt;/span&gt; &lt;span style="color:#ff0000"&gt;&amp;quot;Top ten installed packages:&amp;quot;&lt;/span&gt;
&lt;span style="color:#555555"&gt;   21 &lt;/span&gt;done &lt;span style="color:#000000"&gt;=&lt;/span&gt; &lt;span style="color:#830000"&gt;set&lt;/span&gt;&lt;span style="color:#000000"&gt;()&lt;/span&gt;
&lt;span style="color:#555555"&gt;   22 &lt;/span&gt;&lt;span style="color:#000000; font-weight:bold"&gt;for&lt;/span&gt; pkg &lt;span style="color:#000000; font-weight:bold"&gt;in&lt;/span&gt; pkgs&lt;span style="color:#000000"&gt;:&lt;/span&gt;
&lt;span style="color:#555555"&gt;   23 &lt;/span&gt;    &lt;span style="color:#000000; font-weight:bold"&gt;if&lt;/span&gt; pkg&lt;span style="color:#000000"&gt;.&lt;/span&gt;name &lt;span style="color:#000000; font-weight:bold"&gt;in&lt;/span&gt; done&lt;span style="color:#000000"&gt;:&lt;/span&gt;
&lt;span style="color:#555555"&gt;   24 &lt;/span&gt;        &lt;span style="color:#000000; font-weight:bold"&gt;continue&lt;/span&gt;
&lt;span style="color:#555555"&gt;   25 &lt;/span&gt;    done&lt;span style="color:#000000"&gt;.&lt;/span&gt;&lt;span style="color:#010181"&gt;add&lt;/span&gt;&lt;span style="color:#000000"&gt;(&lt;/span&gt;pkg&lt;span style="color:#000000"&gt;.&lt;/span&gt;name&lt;span style="color:#000000"&gt;)&lt;/span&gt;
&lt;span style="color:#555555"&gt;   26 &lt;/span&gt;    &lt;span style="color:#000000; font-weight:bold"&gt;print&lt;/span&gt; &lt;span style="color:#ff0000"&gt;&amp;quot;%s: %sMB&amp;quot;&lt;/span&gt; &lt;span style="color:#000000"&gt;% (&lt;/span&gt;pkg&lt;span style="color:#000000"&gt;,&lt;/span&gt; pkg&lt;span style="color:#000000"&gt;.&lt;/span&gt;size &lt;span style="color:#000000"&gt;/ (&lt;/span&gt;&lt;span style="color:#2928ff"&gt;1024&lt;/span&gt; &lt;span style="color:#000000"&gt;*&lt;/span&gt; &lt;span style="color:#2928ff"&gt;1024&lt;/span&gt;&lt;span style="color:#000000"&gt;))&lt;/span&gt;
&lt;span style="color:#555555"&gt;   27 &lt;/span&gt;    &lt;span style="color:#000000; font-weight:bold"&gt;if&lt;/span&gt; &lt;span style="color:#830000"&gt;len&lt;/span&gt;&lt;span style="color:#000000"&gt;(&lt;/span&gt;done&lt;span style="color:#000000"&gt;) &amp;gt;=&lt;/span&gt; &lt;span style="color:#2928ff"&gt;10&lt;/span&gt;&lt;span style="color:#000000"&gt;:&lt;/span&gt;
&lt;span style="color:#555555"&gt;   28 &lt;/span&gt;        &lt;span style="color:#000000; font-weight:bold"&gt;break&lt;/span&gt;
&lt;/pre&gt;
&lt;br&gt;&lt;br&gt;

&lt;h2&gt;Slightly more advanced topics&lt;/h2&gt;

&lt;p&gt; After playing with yum code for a little while, you'll probably run into a function that you might expect to return a "package object" but which returns a tuple of data instead. The two common tuples within yum are the "&lt;b&gt;package tuple&lt;/b&gt;" and the "&lt;b&gt;dependency tuple&lt;/b&gt;", which are:&lt;/p&gt;

&lt;pre style="color:#000000; background-color:#ffffff; font-size:10pt; font-family:&amp;#39;Courier New&amp;#39;;"&gt;&lt;span style="color:#555555"&gt;   30 &lt;/span&gt;&lt;span style="color:#838183; font-style:italic"&gt;# Package tuple:&lt;/span&gt;
&lt;span style="color:#555555"&gt;   31 &lt;/span&gt;&lt;span style="color:#000000"&gt;(&lt;/span&gt;pkg&lt;span style="color:#000000"&gt;.&lt;/span&gt;name&lt;span style="color:#000000"&gt;,&lt;/span&gt; pkg&lt;span style="color:#000000"&gt;.&lt;/span&gt;arch&lt;span style="color:#000000"&gt;,&lt;/span&gt; pkg&lt;span style="color:#000000"&gt;.&lt;/span&gt;epoch&lt;span style="color:#000000"&gt;,&lt;/span&gt; pkg&lt;span style="color:#000000"&gt;.&lt;/span&gt;version&lt;span style="color:#000000"&gt;,&lt;/span&gt; pkg&lt;span style="color:#000000"&gt;.&lt;/span&gt;release&lt;span style="color:#000000"&gt;)&lt;/span&gt;
&lt;span style="color:#555555"&gt;   32 &lt;/span&gt;&lt;span style="color:#838183; font-style:italic"&gt;# Dependency (or the Provides/Requires/Conflicts/Obsoletes (PRCO)) tuple:&lt;/span&gt;
&lt;span style="color:#555555"&gt;   33 &lt;/span&gt;&lt;span style="color:#000000"&gt;(&lt;/span&gt;pkg&lt;span style="color:#000000"&gt;.&lt;/span&gt;name&lt;span style="color:#000000"&gt;,&lt;/span&gt; &lt;span style="color:#ff0000"&gt;'EQ'&lt;/span&gt;&lt;span style="color:#000000"&gt;, (&lt;/span&gt;pkg&lt;span style="color:#000000"&gt;.&lt;/span&gt;epoch&lt;span style="color:#000000"&gt;,&lt;/span&gt; pkg&lt;span style="color:#000000"&gt;.&lt;/span&gt;version&lt;span style="color:#000000"&gt;,&lt;/span&gt; pkg&lt;span style="color:#000000"&gt;.&lt;/span&gt;release&lt;span style="color:#000000"&gt;))&lt;/span&gt;
&lt;/pre&gt;
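
&lt;p&gt; To make the two tuple layouts concrete, here is a minimal sketch that builds both from a stand-in object. Note that FakePackage, package_tuple and dependency_tuple are names made up for this illustration (they are not part of the yum API), and the package values are invented; only the field order is taken from the listing above.&lt;/p&gt;

&lt;pre&gt;
```python
class FakePackage(object):
    """Hypothetical stand-in for a yum package object (illustration only)."""

    def __init__(self, name, arch, epoch, version, release):
        self.name = name
        self.arch = arch
        self.epoch = epoch
        self.version = version
        self.release = release


def package_tuple(pkg):
    # Package tuple layout: (name, arch, epoch, version, release)
    return (pkg.name, pkg.arch, pkg.epoch, pkg.version, pkg.release)


def dependency_tuple(pkg, flag="EQ"):
    # Dependency (PRCO) tuple layout: (name, flag, (epoch, version, release))
    return (pkg.name, flag, (pkg.epoch, pkg.version, pkg.release))


# Invented example values, not taken from any real repository.
pkg = FakePackage("yum", "noarch", "0", "3.2.18", "1.fc9")

# Unpacking a package tuple back into named fields.
name, arch, epoch, version, release = package_tuple(pkg)
print("%s-%s-%s.%s" % (name, version, release, arch))
print(dependency_tuple(pkg))
```
&lt;/pre&gt;

&lt;p&gt; The point is that both tuples are plain data, so code that receives one has to unpack it by position rather than ask for attributes.&lt;/p&gt;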

&lt;p&gt; After that you'll probably start playing with the "&lt;b&gt;transaction info&lt;/b&gt;" (yb.tsInfo), and the "install", "update" and "remove" functions of YumBase, so that you can change the system as well as query it. Then you'll want to present your information in a way that looks as good as a normal yum command (using the "internal" output module), although that is less supported. A somewhat useful example might be:&lt;/p&gt;

&lt;h3&gt;"Clever" way to manually update almost all the metadata for any usable repos. installed&lt;/h3&gt;

&lt;pre style="color:#000000; background-color:#ffffff; font-size:10pt; font-family:&amp;#39;Courier New&amp;#39;;"&gt;&lt;span style="color:#555555"&gt;    1 &lt;/span&gt;&lt;span style="color:#838183; font-style:italic"&gt;#! /usr/bin/python -tt&lt;/span&gt;
&lt;span style="color:#555555"&gt;    2 &lt;/span&gt;
&lt;span style="color:#555555"&gt;    3 &lt;/span&gt;&lt;span style="color:#000000; font-weight:bold"&gt;import&lt;/span&gt; os
&lt;span style="color:#555555"&gt;    4 &lt;/span&gt;&lt;span style="color:#000000; font-weight:bold"&gt;import&lt;/span&gt; sys
&lt;span style="color:#555555"&gt;    5 &lt;/span&gt;&lt;span style="color:#000000; font-weight:bold"&gt;import&lt;/span&gt; yum
&lt;span style="color:#555555"&gt;    6 &lt;/span&gt;
&lt;span style="color:#555555"&gt;    7 &lt;/span&gt;yb &lt;span style="color:#000000"&gt;=&lt;/span&gt; yum&lt;span style="color:#000000"&gt;.&lt;/span&gt;&lt;span style="color:#010181"&gt;YumBase&lt;/span&gt;&lt;span style="color:#000000"&gt;()&lt;/span&gt;
&lt;span style="color:#555555"&gt;    8 &lt;/span&gt;
&lt;span style="color:#555555"&gt;    9 &lt;/span&gt;yb&lt;span style="color:#000000"&gt;.&lt;/span&gt;conf&lt;span style="color:#000000"&gt;.&lt;/span&gt;cache &lt;span style="color:#000000"&gt;=&lt;/span&gt; os&lt;span style="color:#000000"&gt;.&lt;/span&gt;&lt;span style="color:#010181"&gt;geteuid&lt;/span&gt;&lt;span style="color:#000000"&gt;() !=&lt;/span&gt; &lt;span style="color:#2928ff"&gt;0&lt;/span&gt;
&lt;span style="color:#555555"&gt;   10 &lt;/span&gt;
&lt;span style="color:#555555"&gt;   11 &lt;/span&gt;&lt;span style="color:#000000; font-weight:bold"&gt;from&lt;/span&gt; urlgrabber&lt;span style="color:#000000"&gt;.&lt;/span&gt;progress &lt;span style="color:#000000; font-weight:bold"&gt;import&lt;/span&gt; TextMeter
&lt;span style="color:#555555"&gt;   12 &lt;/span&gt;
&lt;span style="color:#555555"&gt;   13 &lt;/span&gt;&lt;span style="color:#838183; font-style:italic"&gt;# Use the &amp;quot;internal&amp;quot; output mode of yum's cli&lt;/span&gt;
&lt;span style="color:#555555"&gt;   14 &lt;/span&gt;sys&lt;span style="color:#000000"&gt;.&lt;/span&gt;path&lt;span style="color:#000000"&gt;.&lt;/span&gt;&lt;span style="color:#010181"&gt;insert&lt;/span&gt;&lt;span style="color:#000000"&gt;(&lt;/span&gt;&lt;span style="color:#2928ff"&gt;0&lt;/span&gt;&lt;span style="color:#000000"&gt;,&lt;/span&gt; &lt;span style="color:#ff0000"&gt;'/usr/share/yum-cli'&lt;/span&gt;&lt;span style="color:#000000"&gt;)&lt;/span&gt;
&lt;span style="color:#555555"&gt;   15 &lt;/span&gt;&lt;span style="color:#000000; font-weight:bold"&gt;import&lt;/span&gt; output
&lt;span style="color:#555555"&gt;   16 &lt;/span&gt;
&lt;span style="color:#555555"&gt;   17 &lt;/span&gt;&lt;span style="color:#838183; font-style:italic"&gt;# Try not to be annoying if run from cron etc.&lt;/span&gt;
&lt;span style="color:#555555"&gt;   18 &lt;/span&gt;&lt;span style="color:#000000; font-weight:bold"&gt;if&lt;/span&gt; sys&lt;span style="color:#000000"&gt;.&lt;/span&gt;stdout&lt;span style="color:#000000"&gt;.&lt;/span&gt;&lt;span style="color:#010181"&gt;isatty&lt;/span&gt;&lt;span style="color:#000000"&gt;():&lt;/span&gt;
&lt;span style="color:#555555"&gt;   19 &lt;/span&gt;    yb&lt;span style="color:#000000"&gt;.&lt;/span&gt;repos&lt;span style="color:#000000"&gt;.&lt;/span&gt;&lt;span style="color:#010181"&gt;setProgressBar&lt;/span&gt;&lt;span style="color:#000000"&gt;(&lt;/span&gt;&lt;span style="color:#010181"&gt;TextMeter&lt;/span&gt;&lt;span style="color:#000000"&gt;(&lt;/span&gt;fo&lt;span style="color:#000000"&gt;=&lt;/span&gt;sys&lt;span style="color:#000000"&gt;.&lt;/span&gt;stdout&lt;span style="color:#000000"&gt;))&lt;/span&gt;
&lt;span style="color:#555555"&gt;   20 &lt;/span&gt;    yb&lt;span style="color:#000000"&gt;.&lt;/span&gt;repos&lt;span style="color:#000000"&gt;.&lt;/span&gt;callback &lt;span style="color:#000000"&gt;=&lt;/span&gt; output&lt;span style="color:#000000"&gt;.&lt;/span&gt;&lt;span style="color:#010181"&gt;CacheProgressCallback&lt;/span&gt;&lt;span style="color:#000000"&gt;()&lt;/span&gt;
&lt;span style="color:#555555"&gt;   21 &lt;/span&gt;    yumout &lt;span style="color:#000000"&gt;=&lt;/span&gt; output&lt;span style="color:#000000"&gt;.&lt;/span&gt;&lt;span style="color:#010181"&gt;YumOutput&lt;/span&gt;&lt;span style="color:#000000"&gt;()&lt;/span&gt;
&lt;span style="color:#555555"&gt;   22 &lt;/span&gt;    freport &lt;span style="color:#000000"&gt;= (&lt;/span&gt; yumout&lt;span style="color:#000000"&gt;.&lt;/span&gt;failureReport&lt;span style="color:#000000"&gt;, (), {} )&lt;/span&gt;
&lt;span style="color:#555555"&gt;   23 &lt;/span&gt;    yb&lt;span style="color:#000000"&gt;.&lt;/span&gt;repos&lt;span style="color:#000000"&gt;.&lt;/span&gt;&lt;span style="color:#010181"&gt;setFailureCallback&lt;/span&gt;&lt;span style="color:#000000"&gt;(&lt;/span&gt; freport &lt;span style="color:#000000"&gt;)&lt;/span&gt;
&lt;span style="color:#555555"&gt;   24 &lt;/span&gt;
&lt;span style="color:#555555"&gt;   25 &lt;/span&gt;&lt;span style="color:#838183; font-style:italic"&gt;# Enable all the repos. a user might want to use and sync. the metadata.&lt;/span&gt;
&lt;span style="color:#555555"&gt;   26 &lt;/span&gt;&lt;span style="color:#838183; font-style:italic"&gt;# Note this needs to be done before the repositories are used.&lt;/span&gt;
&lt;span style="color:#555555"&gt;   27 &lt;/span&gt;&lt;span style="color:#000000; font-weight:bold"&gt;for&lt;/span&gt; name &lt;span style="color:#000000; font-weight:bold"&gt;in&lt;/span&gt; &lt;span style="color:#000000"&gt;(&lt;/span&gt;&lt;span style="color:#ff0000"&gt;'updates-testing'&lt;/span&gt;&lt;span style="color:#000000"&gt;,&lt;/span&gt; &lt;span style="color:#ff0000"&gt;'rawhide'&lt;/span&gt;&lt;span style="color:#000000"&gt;,&lt;/span&gt; &lt;span style="color:#ff0000"&gt;'livna'&lt;/span&gt;&lt;span style="color:#000000"&gt;,&lt;/span&gt; &lt;span style="color:#ff0000"&gt;'adobe-linux-i386'&lt;/span&gt;&lt;span style="color:#000000"&gt;,&lt;/span&gt;
&lt;span style="color:#555555"&gt;   28 &lt;/span&gt;             &lt;span style="color:#ff0000"&gt;'brew'&lt;/span&gt;&lt;span style="color:#000000"&gt;,&lt;/span&gt; &lt;span style="color:#ff0000"&gt;'rhts'&lt;/span&gt;&lt;span style="color:#000000"&gt;,&lt;/span&gt; &lt;span style="color:#ff0000"&gt;'koji-static'&lt;/span&gt;&lt;span style="color:#000000"&gt;):&lt;/span&gt;
&lt;span style="color:#555555"&gt;   29 &lt;/span&gt;    yb&lt;span style="color:#000000"&gt;.&lt;/span&gt;repos&lt;span style="color:#000000"&gt;.&lt;/span&gt;&lt;span style="color:#010181"&gt;enableRepo&lt;/span&gt;&lt;span style="color:#000000"&gt;(&lt;/span&gt;name &lt;span style="color:#000000"&gt;+&lt;/span&gt; &lt;span style="color:#ff0000"&gt;','&lt;/span&gt;&lt;span style="color:#000000"&gt;)&lt;/span&gt;
&lt;span style="color:#555555"&gt;   30 &lt;/span&gt;&lt;span style="color:#000000; font-weight:bold"&gt;for&lt;/span&gt; repo &lt;span style="color:#000000; font-weight:bold"&gt;in&lt;/span&gt; yb&lt;span style="color:#000000"&gt;.&lt;/span&gt;repos&lt;span style="color:#000000"&gt;.&lt;/span&gt;&lt;span style="color:#010181"&gt;listEnabled&lt;/span&gt;&lt;span style="color:#000000"&gt;():&lt;/span&gt;
&lt;span style="color:#555555"&gt;   31 &lt;/span&gt;    yb&lt;span style="color:#000000"&gt;.&lt;/span&gt;repos&lt;span style="color:#000000"&gt;.&lt;/span&gt;&lt;span style="color:#010181"&gt;enableRepo&lt;/span&gt;&lt;span style="color:#000000"&gt;(&lt;/span&gt;repo&lt;span style="color:#000000"&gt;.&lt;/span&gt;&lt;span style="color:#830000"&gt;id&lt;/span&gt; &lt;span style="color:#000000"&gt;+&lt;/span&gt; &lt;span style="color:#ff0000"&gt;'-source'&lt;/span&gt;    &lt;span style="color:#000000"&gt;+&lt;/span&gt; &lt;span style="color:#ff0000"&gt;','&lt;/span&gt;&lt;span style="color:#000000"&gt;)&lt;/span&gt;
&lt;span style="color:#555555"&gt;   32 &lt;/span&gt;    yb&lt;span style="color:#000000"&gt;.&lt;/span&gt;repos&lt;span style="color:#000000"&gt;.&lt;/span&gt;&lt;span style="color:#010181"&gt;enableRepo&lt;/span&gt;&lt;span style="color:#000000"&gt;(&lt;/span&gt;repo&lt;span style="color:#000000"&gt;.&lt;/span&gt;&lt;span style="color:#830000"&gt;id&lt;/span&gt; &lt;span style="color:#000000"&gt;+&lt;/span&gt; &lt;span style="color:#ff0000"&gt;'-debuginfo'&lt;/span&gt; &lt;span style="color:#000000"&gt;+&lt;/span&gt; &lt;span style="color:#ff0000"&gt;','&lt;/span&gt;&lt;span style="color:#000000"&gt;)&lt;/span&gt;
&lt;span style="color:#555555"&gt;   33 &lt;/span&gt;yb&lt;span style="color:#000000"&gt;.&lt;/span&gt;repos&lt;span style="color:#000000"&gt;.&lt;/span&gt;&lt;span style="color:#010181"&gt;doSetup&lt;/span&gt;&lt;span style="color:#000000"&gt;()&lt;/span&gt;
&lt;span style="color:#555555"&gt;   34 &lt;/span&gt;&lt;span style="color:#000000; font-weight:bold"&gt;for&lt;/span&gt; repo &lt;span style="color:#000000; font-weight:bold"&gt;in&lt;/span&gt; yb&lt;span style="color:#000000"&gt;.&lt;/span&gt;repos&lt;span style="color:#000000"&gt;.&lt;/span&gt;&lt;span style="color:#010181"&gt;listEnabled&lt;/span&gt;&lt;span style="color:#000000"&gt;():&lt;/span&gt;
&lt;span style="color:#555555"&gt;   35 &lt;/span&gt;    repo&lt;span style="color:#000000"&gt;.&lt;/span&gt;mdpolicy        &lt;span style="color:#000000"&gt;=&lt;/span&gt; &lt;span style="color:#ff0000"&gt;'group:main'&lt;/span&gt;
&lt;span style="color:#555555"&gt;   36 &lt;/span&gt;    repo&lt;span style="color:#000000"&gt;.&lt;/span&gt;metadata_expire &lt;span style="color:#000000"&gt;=&lt;/span&gt; &lt;span style="color:#2928ff"&gt;0&lt;/span&gt;
&lt;span style="color:#555555"&gt;   37 &lt;/span&gt;    repo&lt;span style="color:#000000"&gt;.&lt;/span&gt;repoXML
&lt;span style="color:#555555"&gt;   38 &lt;/span&gt;&lt;span style="color:#838183; font-style:italic"&gt;# This is somehwat &amp;quot;magic&amp;quot;, it unpacks the metadata making it usable.&lt;/span&gt;
&lt;span style="color:#555555"&gt;   39 &lt;/span&gt;yb&lt;span style="color:#000000"&gt;.&lt;/span&gt;repos&lt;span style="color:#000000"&gt;.&lt;/span&gt;&lt;span style="color:#010181"&gt;populateSack&lt;/span&gt;&lt;span style="color:#000000"&gt;(&lt;/span&gt;mdtype&lt;span style="color:#000000"&gt;=&lt;/span&gt;&lt;span style="color:#ff0000"&gt;'metadata'&lt;/span&gt;&lt;span style="color:#000000"&gt;,&lt;/span&gt; cacheonly&lt;span style="color:#000000"&gt;=&lt;/span&gt;&lt;span style="color:#2928ff"&gt;1&lt;/span&gt;&lt;span style="color:#000000"&gt;)&lt;/span&gt;
&lt;span style="color:#555555"&gt;   40 &lt;/span&gt;yb&lt;span style="color:#000000"&gt;.&lt;/span&gt;repos&lt;span style="color:#000000"&gt;.&lt;/span&gt;&lt;span style="color:#010181"&gt;populateSack&lt;/span&gt;&lt;span style="color:#000000"&gt;(&lt;/span&gt;mdtype&lt;span style="color:#000000"&gt;=&lt;/span&gt;&lt;span style="color:#ff0000"&gt;'filelists'&lt;/span&gt;&lt;span style="color:#000000"&gt;,&lt;/span&gt; cacheonly&lt;span style="color:#000000"&gt;=&lt;/span&gt;&lt;span style="color:#2928ff"&gt;1&lt;/span&gt;&lt;span style="color:#000000"&gt;)&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt; Hopefully that will go some way to giving you an overview of how you can use the yum API to perform queries or tasks that would otherwise be very difficult. For further information you can use the &lt;b&gt;help&lt;/b&gt; feature of ipython to look at docstrings for the various components, and even use the TAB complete feature of ipython to see all the available attributes of packages, repos, pkgSack or the YumBase itself.
&lt;/p&gt;

&lt;a name='cutid1-end'&gt;&lt;/a&gt;</content>
  </entry>
  <entry>
    <id>urn:lj:livejournal.com:atom1:illiterat:6119</id>
    <link rel="alternate" type="text/html" href="http://illiterat.livejournal.com/6119.html"/>
    <link rel="self" type="text/xml" href="http://illiterat.livejournal.com/data/atom/?itemid=6119"/>
    <title>Abusing the depsolver for fun/Sudoku</title>
    <published>2008-08-25T03:58:19Z</published>
    <updated>2008-08-25T17:31:41Z</updated>
    <category term="yum"/>
    <category term="games"/>
    <category term="depsolving"/>
    <category term="hacks"/>
    <content type="html">&lt;p&gt; After seeing the article on &lt;a href="http://algebraicthunk.net/~dburrows/blog/entry/package-management-sudoku/" rel="nofollow"&gt;sudoku in apt/dpkg&lt;/a&gt;, I naturally thought "I wonder how you can do that in rpm". The big differences (from what I read/understood of the article) being that rpm doesn't mind if two packages with the same virtual provide are installed at once where dpkg does (and their "game" relies on that).
&lt;/p&gt;

&lt;p&gt; The natural solution seemed to be to use provides and conflicts, although I didn't think of other forms for long before just trying to implement that one :). The basic premise is to have 9 * 9 * 9 packages, each providing one of cell_1.1.1 through cell_9.9.9 (which is the row, column and value) and one of cell_value_1.1 through cell_value_9.9 (just the row and column). A puzzle then requires all 81 cell_values, plus the specific cells it has prefilled. You then just add conflicts in each package obeying the rules for block, row and column. Simple.
&lt;/p&gt;
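
&lt;p&gt; As a rough sketch of that premise (not the actual scripts linked later), the package metadata can be generated in a few lines. The make_packages and block_of names and the dictionary layout are my own assumptions for illustration; only the cell_R.C.V and cell_value_R.C naming comes from the description above.&lt;/p&gt;

&lt;pre&gt;
```python
import itertools

DIGITS = range(1, 10)


def block_of(row, col):
    # Index of the 3x3 block that contains the cell.
    return ((row - 1) // 3, (col - 1) // 3)


def make_packages():
    """Build provides/conflicts data for all 9 * 9 * 9 cell packages."""
    packages = {}
    for row, col, val in itertools.product(DIGITS, DIGITS, DIGITS):
        name = "cell_%d.%d.%d" % (row, col, val)
        conflicts = set()
        for r2, c2 in itertools.product(DIGITS, DIGITS):
            if (r2, c2) == (row, col):
                continue
            same_unit = (r2 == row or c2 == col
                         or block_of(r2, c2) == block_of(row, col))
            if same_unit:
                # The same value may not appear again in this row, column or block.
                conflicts.add("cell_%d.%d.%d" % (r2, c2, val))
        packages[name] = {
            "provides": [name, "cell_value_%d.%d" % (row, col)],
            "conflicts": sorted(conflicts),
        }
    return packages


packages = make_packages()
# What a "game" package would require: one provider for every cell.
game_requires = ["cell_value_%d.%d" % (r, c)
                 for r, c in itertools.product(DIGITS, DIGITS)]
print("%d packages, %d requires" % (len(packages), len(game_requires)))
print("%d conflicts per package" % len(packages["cell_1.1.1"]["conflicts"]))
```
&lt;/pre&gt;

&lt;p&gt; In this sketch each package conflicts with 20 others (8 in its row, 8 in its column and 4 more in its block). No explicit "one value per cell" conflict is modelled: a value can appear at most once per row, so a row can hold at most 9 packages, and since all 9 of its cells need a provider each cell gets exactly one.&lt;/p&gt;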

&lt;p&gt;
 I was pretty sure yum would fail at solving the sudoku puzzles, because it basically requires that when you get multiple answers to a provides question (what package provides cell_1.1) you need to repeatedly narrow it down based on &lt;b&gt;future information&lt;/b&gt;. This is kind of on the longer-term road map for yum so that it'll get better results, in certain edge cases, for real packages in real repositories.
&lt;/p&gt;

&lt;p&gt; Read on for scripts/repos. and results (spoiler, yum does fail and smart succeeds).
&lt;/p&gt;

&lt;a name="cutid1"&gt;&lt;/a&gt;
&lt;p&gt; I've made the &lt;a href="http://fedorapeople.org/~james/yum/sudoku/" rel="nofollow"&gt;scripts and repositories&lt;/a&gt; for the game and puzzles available. So it's as simple as possible to try it out.
&lt;/p&gt;

&lt;p&gt; The puzzle_complete_* packages are complete puzzles (all 81 cells prefilled, and the "game" included). The puzzle_(medium|hard)_* packages are just random games I copied from gnome-sudoku on medium/hard setting. Note that you can't actually do the "install" as the packages don't exist in anything other than metadata, but that's probably a good thing.
&lt;/p&gt;

&lt;p&gt; Results are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt; yum-3.2.18 just picks some "random" values for the missing cells, and then complains about the resulting conflicts.
&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt; apt-0.5.15lorg3.94-3.fc9.x86_64 seems to decide a bunch of the cells are bad, and gives an "unmet dependencies" error.
&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt; smart-0:0.52-54.fc9.x86_64 solves the medium puzzles! And it hasn't given up on one of the non-guessing hard puzzles yet, although it's been over 20 minutes.
&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Update:&lt;/b&gt; smart solves hard_1 (no guessing required) and hard_2 (which requires guessing), taking only 3-4 minutes for hard_2 although hard_1 takes it just over 30 minutes. Interesting.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;a name='cutid1-end'&gt;&lt;/a&gt;</content>
  </entry>
  <entry>
    <id>urn:lj:livejournal.com:atom1:illiterat:5720</id>
    <link rel="alternate" type="text/html" href="http://illiterat.livejournal.com/5720.html"/>
    <link rel="self" type="text/xml" href="http://illiterat.livejournal.com/data/atom/?itemid=5720"/>
    <title>Not putting all your pkgs in one repo. (yum edition)</title>
    <published>2008-08-13T01:43:58Z</published>
    <updated>2010-01-22T22:50:24Z</updated>
    <category term="best practise"/>
    <category term="package management"/>
    <category term="yum"/>
    <content type="html">&lt;p&gt; As you create packages for private use, the question will eventually come up "where do I put these". The choice is obvious for the first package, just create a repo. (using createrepo -d) and distribute the my.repo file. However as you create more packages the answer &lt;b&gt;should&lt;/b&gt; expand to having multiple repos. Different package managers have a different ideas about what should go inside a single repository, and the corollary to that is if you take the "best practice" for some other package manager and apply it to yum the results might not be all they could be. This posting will try and lay out the best practises for yum (3.2.z), and some of the reasons for them.
&lt;/p&gt;

&lt;a name="cutid1"&gt;&lt;/a&gt;

&lt;p&gt; Probably the most common problem I see is people who think "&lt;b&gt;yum-priorities&lt;/b&gt;" and/or "&lt;b&gt;yum-versionlock&lt;/b&gt;" is a good solution to their problem; it is not. Another warning sign is the use of exclude and/or includepkgs, which are just a faster version of the above. While those plugins add certain features that make certain things &lt;i&gt;possible&lt;/i&gt;, needing those features is &lt;b&gt;always&lt;/b&gt; a sign that you've got the separation of packages into repositories &lt;b&gt;wrong&lt;/b&gt; (or something is just plain broken).
&lt;/p&gt;

&lt;p&gt;
 One of the lesser problems is that they are not scalable: the basic implementation gets all the "good" packages and then excludes all the "bad" packages. However the real problems start due to the shortcuts they take so they can be faster. For instance, if pkgA depends on pkgB and only pkgB is a "good" / "bad" package then pkgA will now either have a missing dependency or end up running with something it wasn't built with/for. As more packages are added to the repositories it's more likely that problems will arise.
&lt;/p&gt;

&lt;p&gt; The correct solution to problems that suggest yum-priorities, having packages that are an extension of an upstream and packages that override others in the upstream, is to split your one repository into more than one. Here is a quick list of reasons you'd want more than one repository:
&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;b&gt;debuginfo&lt;/b&gt;: 
&lt;p&gt; Debuginfo packages are large, and rarely needed. Also the yum-utils command, debuginfo-install, will automatically install them correctly if they are in a repo. with the same base repoid but with "-debuginfo" appended.
&lt;/p&gt;

&lt;/li&gt;

&lt;li&gt;&lt;b&gt;source&lt;/b&gt;: 
&lt;p&gt; Source packages also tend to be large, and also rarely needed. In this case the yum-utils command, yumdownloader --source, will automatically download them correctly if they are in a repo. with the same base repoid but with "-source" appended.&lt;/p&gt;

&lt;/li&gt;

&lt;li&gt;&lt;b&gt;Binary architectures&lt;/b&gt;: 
&lt;p&gt; Putting .i386 and .ppc packages in the same repository is a little more convenient on the server side, but it means every action on the client side has to filter out all the packages of the wrong architecture. Also, it means you have almost no control over multilib issues on the client (.i386 and .x86_64 on a x86_64 machine, for example).&lt;/p&gt;

&lt;/li&gt;

&lt;li&gt;&lt;b&gt;stability or freshness&lt;/b&gt;: 
&lt;p&gt; While yum can install old packages, and move to newer packages that aren't the latest versions, its support for this is much weaker than the more expected case of installing or moving to the latest available package (for instance dependencies of installed packages will currently always move to the latest available). So it makes sense to have myrepo-release, myrepo-stable, myrepo-rc, myrepo-testing, myrepo-development type repositories.&lt;/p&gt;
&lt;p&gt; Yum's ability to let you easily &lt;b&gt;temporarily add a repository&lt;/b&gt; with --enablerepo=myrepo-testing (or use yum-aliases to create a synonym), or even use yum-tmprepo, makes this a much more viable alternative than it otherwise would be. It means you can have several repositories that are configured as disabled, and enable them only temporarily (to install a single package etc.).
&lt;/p&gt;

&lt;/li&gt;

&lt;li&gt;&lt;b&gt;enhances or overrides&lt;/b&gt;: 
&lt;p&gt; Sometimes you want extra packages because they are not available elsewhere and sometimes you want custom "upstream" packages (for specific local patches etc.). You should not mix these two types of packages. Keeping them apart also makes it easier to answer questions like "Do our extension packages rely on any of our custom upstream packages".&lt;/p&gt;
&lt;/li&gt;

&lt;/ul&gt;
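
&lt;p&gt; The split described in the list above could be sketched as a small generator for the .repo file. This is only an illustration: repo_family, repo_stanza and repo_file are made-up helper names, the repo.example.com URL and its directory layout are invented, and only the "-debuginfo" / "-source" suffix convention and the idea of shipping extra repos. disabled come from the text.&lt;/p&gt;

&lt;pre&gt;
```python
def repo_family(base_id, stages=("testing",)):
    """Repo ids for one base repository, following the suffix convention."""
    ids = [base_id, base_id + "-debuginfo", base_id + "-source"]
    for stage in stages:
        ids.append("%s-%s" % (base_id, stage))
    return ids


def repo_stanza(repo_id, base_url, enabled):
    # One section of a .repo file; the URL layout is a made-up example.
    lines = [
        "[%s]" % repo_id,
        "name=%s" % repo_id,
        "baseurl=%s/%s/$basearch/" % (base_url, repo_id),
        "enabled=%d" % (1 if enabled else 0),
    ]
    return "\n".join(lines)


def repo_file(base_id, base_url):
    # Only the base repo is enabled; the others are turned on when needed,
    # for example with --enablerepo=myrepo-testing.
    stanzas = [repo_stanza(repo_id, base_url, enabled=(repo_id == base_id))
               for repo_id in repo_family(base_id)]
    return "\n\n".join(stanzas) + "\n"


print(repo_file("myrepo", "http://repo.example.com"))
```
&lt;/pre&gt;

&lt;p&gt; Each architecture gets its own directory through $basearch in this sketch, so clients never have to filter out packages built for other architectures.&lt;/p&gt;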

&lt;p&gt; One of the main things that isn't listed above is "security updates" vs. "bugfix updates", this type of update is often split in other package managers. However in yum you should just create (or generate) updateinfo.xml data for the repository, and then use the yum-security plugin to select security updates (or fixes for specific CVE/BZs/etc.)
&lt;/p&gt;

&lt;a name='cutid1-end'&gt;&lt;/a&gt;</content>
  </entry>
  <entry>
    <id>urn:lj:livejournal.com:atom1:illiterat:5571</id>
    <link rel="alternate" type="text/html" href="http://illiterat.livejournal.com/5571.html"/>
    <link rel="self" type="text/xml" href="http://illiterat.livejournal.com/data/atom/?itemid=5571"/>
    <title>Understanding groups in yum</title>
    <published>2008-08-08T00:15:26Z</published>
    <updated>2010-01-22T22:51:27Z</updated>
    <category term="faq"/>
    <category term="yum"/>
    <category term="fedora"/>
    <content type="html">&lt;p&gt; Every now and again someone will will ask a question about groups in yum that amounts to: 
&lt;/p&gt;


&lt;blockquote&gt;&lt;p&gt;If I do "&lt;b&gt;groupinstall xyz&lt;/b&gt;" and then I do "&lt;b&gt;groupremove xyz&lt;/b&gt;" why do I not end up where I started?&lt;/p&gt;
&lt;/blockquote&gt;

&lt;a name="cutid1"&gt;&lt;/a&gt;
&lt;p&gt; The main problem here is thinking of "groups" as objects that are installed and/or removed by yum, in fact currently the &lt;b&gt;only&lt;/b&gt; way yum stores data on your computer is:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt; via its caches of network data (can be removed at any time)&lt;/li&gt;
&lt;li&gt; via its log file (can be removed at any time)&lt;/li&gt;
&lt;li&gt; via its transaction log (can be removed at any time, although yum won't be able to recover from transaction errors)&lt;/li&gt;
&lt;li&gt; via. rpm&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;...and given that rpm only installs packages, and that doesn't include extra package data like which "groups" the package is in, it's easy to realize that yum &lt;b&gt;cannot&lt;/b&gt; perform the groupinstall/groupremove operations using that model (even if that seems like the "correct" thing to do).&lt;/p&gt;

&lt;h3&gt; What really happens then? &lt;/h3&gt;

&lt;p&gt; The simple way of thinking about it is that each group is a collection of package names, and on a groupinstall/groupremove yum collects all the &lt;b&gt;package names&lt;/b&gt; in the group and tries to install each one (or remove each one). This works exactly as if you had run groupinfo, put all the names in a text file and run that through the "yum shell" command. There is very little magic in how this works, and it is a simple model (once you forget the "obvious" model as above). One way of thinking about it is that groups are more like "tags" in del.icio.us or livejournal etc.&lt;/p&gt;

&lt;p&gt; There is a &lt;small&gt;little&lt;/small&gt; more complication in that each group actually has four lists of package names, but groupinfo also displays the different lists in the groups so again the model is the same as the text file example. 
&lt;/p&gt;

&lt;p&gt;
 So as you might expect from the above: if you have x, y and z installed; then groupinstall "foo" which contains a, b and y; then groupremove "foo" -- you'll end up with x and z.
&lt;/p&gt;
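
&lt;p&gt; The x, y, z example can be written down as plain set operations. This is a toy model, not yum code: group_install and group_remove are made-up names, and dependency handling is deliberately left out.&lt;/p&gt;

&lt;pre&gt;
```python
def group_install(installed, group_packages):
    # Install every package name listed in the group.
    return set(installed) | set(group_packages)


def group_remove(installed, group_packages):
    # Remove every package name listed in the group, however it got installed.
    return set(installed) - set(group_packages)


installed = set(["x", "y", "z"])
foo = ["a", "b", "y"]  # package names in group "foo"

after_install = group_install(installed, foo)
after_remove = group_remove(after_install, foo)

print(sorted(after_install))  # ['a', 'b', 'x', 'y', 'z']
print(sorted(after_remove))   # ['x', 'z']
```
&lt;/pre&gt;

&lt;p&gt; Nothing records that "y" was installed before the groupinstall, which is why it does not survive the groupremove.&lt;/p&gt;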

&lt;h3&gt; But what about if we just add some magic? &lt;/h3&gt;

&lt;p&gt; The next question people ask (usually without fully understanding the above) is something like ok so groupremove will remove things I had installed before a groupinstall ... but if I do "groupinstall GRP1 GRP2" and then "groupremove GRP1" it should be easy to just keep any packages in GRP1 and GRP2, no?
&lt;/p&gt;

&lt;p&gt; And this does sound easy to implement: just remove packages that are only in GRP1. Except that packages can be in any number of groups. So consider the first example again: the question implicitly assumes that whether "y" gets removed should be based on whether "y" happens to be in another group or not. This little bit of magic in yum would then become &lt;b&gt;very magic&lt;/b&gt; for the user, and it would be very hard to tell what a groupremove command is going to do.
&lt;/p&gt;
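
&lt;p&gt; A toy model of that "magic" shows why it is hard to predict. The magic_group_remove function and the extra group "bar" are hypothetical, invented only for this sketch; yum does not behave this way.&lt;/p&gt;

&lt;pre&gt;
```python
def magic_group_remove(installed, groups, group_id):
    """Remove a group's packages, but keep any that another group also lists."""
    keep = set()
    for other_id, names in groups.items():
        if other_id != group_id:
            keep.update(names)
    doomed = set(groups[group_id]) - keep
    return set(installed) - doomed


installed = set(["a", "b", "x", "y", "z"])

# The same command, run against two different sets of group definitions.
groups_today = {"foo": ["a", "b", "y"]}
groups_tomorrow = {"foo": ["a", "b", "y"], "bar": ["y", "q"]}

print(sorted(magic_group_remove(installed, groups_today, "foo")))
print(sorted(magic_group_remove(installed, groups_tomorrow, "foo")))
```
&lt;/pre&gt;

&lt;p&gt; In the first call "y" is removed; in the second it stays, purely because some repository started listing it in an unrelated group. The user typed the same thing both times.&lt;/p&gt;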

&lt;h3&gt; But, but, usability! Do what the users expect! &lt;/h3&gt;

&lt;p&gt; In my opinion most of the problem here is with the way applications present the concept of "group" to the user, including yum itself (although no one would like the new command name for the cmd line client). GUI package management applications that use yum group definitions shouldn't present the user with an option to "install group X"; instead the operation should be presented to the user as "install/remove all packages in group X" with the option to (de)select only parts of the list.
&lt;/p&gt;

&lt;p&gt; The other thing to do is for people to fix their groups so that "common" applications aren't in weird groups, so that groupremove on those groups is a useful operation again.
&lt;/p&gt;

&lt;h3&gt; Cool, groups are interesting but I can't control the groups from Fedora/etc. &lt;/h3&gt;

&lt;p&gt; Actually you can, the full list of groups in yum is taken from all the enabled repositories. This means that if you create an empty repository with a group file in it, you can create your own groups! Just like tagging your own URLs in del.icio.us&lt;/p&gt;
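
&lt;p&gt; A group file for such a repository could be generated roughly like this. Treat it as a sketch under assumptions: make_comps is a made-up helper, the group id and package names are invented, and the element names follow the comps.xml layout as I understand it, so compare against a real group file before relying on it.&lt;/p&gt;

&lt;pre&gt;
```python
import xml.etree.ElementTree as ET


def make_comps(group_id, name, description, package_names):
    """Build a minimal comps-style document containing a single group."""
    comps = ET.Element("comps")
    group = ET.SubElement(comps, "group")
    ET.SubElement(group, "id").text = group_id
    ET.SubElement(group, "name").text = name
    ET.SubElement(group, "description").text = description
    ET.SubElement(group, "default").text = "false"
    ET.SubElement(group, "uservisible").text = "true"
    packagelist = ET.SubElement(group, "packagelist")
    for pkg_name in package_names:
        # Every package goes into the "default" list in this sketch.
        req = ET.SubElement(packagelist, "packagereq", type="default")
        req.text = pkg_name
    return comps


# Invented group, purely for illustration.
comps = make_comps("my-tools", "My tools", "Packages I always want",
                   ["strace", "ltrace"])
print(ET.tostring(comps).decode("ascii"))
```
&lt;/pre&gt;

&lt;p&gt; Saved as comps.xml, the output would then be handed to createrepo via its groupfile option (-g) when creating the otherwise empty repository.&lt;/p&gt;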

&lt;a name='cutid1-end'&gt;&lt;/a&gt;</content>
  </entry>
  <entry>
    <id>urn:lj:livejournal.com:atom1:illiterat:5218</id>
    <link rel="alternate" type="text/html" href="http://illiterat.livejournal.com/5218.html"/>
    <link rel="self" type="text/xml" href="http://illiterat.livejournal.com/data/atom/?itemid=5218"/>
    <title>Understanding why YUM seems "slow", and some instant solutions</title>
    <published>2008-05-21T07:39:18Z</published>
    <updated>2010-01-22T22:51:10Z</updated>
    <category term="performance"/>
    <category term="yum"/>
    <category term="benchmarks"/>
    <content type="html">&lt;p&gt; The &lt;a href="http://illiterat.livejournal.com/5043.html" rel="nofollow"&gt;previous entry&lt;/a&gt; was more of a general explanation about YUM speed, and how current YUM, if used in its normal environment, is often more than fast enough. This entry is a companion to it, but focuses on specific things that might make YUM appear slow but would better be described as using it suboptimally.
&lt;/p&gt;

&lt;p&gt; Note that I'm still not going to directly compare to other tools as I'm not as familiar with them and, as I said in the previous article, the tools are designed so differently that they don't lend themselves to comparisons. Also, as I said before, if an operation is fine as long as it takes less than 10 seconds, then when two tools take 6 seconds and 4 seconds it doesn't really matter whether yum is the faster one (again, see the previous post, this should not be the end goal IMO).
&lt;/p&gt;

&lt;a name="cutid1"&gt;&lt;/a&gt;
&lt;ul&gt;

&lt;li&gt; &lt;h3&gt;Benchmarking "makecache":&lt;/h3&gt;&lt;p&gt; If you are benchmarking the makecache command in yum, you have utterly failed. This command &lt;b&gt;should be entirely a network test&lt;/b&gt;. If you have repos. that you care about which are shipping with only .xml data and not also .sqlite data (generated from: createrepo -d) then complain to whoever is generating the data. Nothing YUM can do to convert the .xml data locally will be faster than the 0 seconds it takes when the .sqlite data is already provided (the .sqlite data also tends to be smaller, and so comes down the network sooner).&lt;/p&gt; &lt;/li&gt;

&lt;li&gt; &lt;h3&gt;Metadata checks slowing you down:&lt;/h3&gt; &lt;p&gt;Yes, YUM will automatically check that your metadata is current every 90 minutes. You can alter this using the metadata_expire configuration variable, but the better thing to do is run some non-interactive process which will do the check every hour (yum-updatesd can be configured to do just this, but "yum list updates" in a cron job will also work). &lt;/p&gt;&lt;/li&gt;

&lt;li&gt; &lt;h3&gt;One big repo. hurts performance:&lt;/h3&gt; &lt;p&gt;As I've said, YUM is mostly tested with Fedora/CentOS/etc., all of which use $basearch variables to have different architecture packages in different places, and also to have the set of three repositories "foo, foo-source and foo-debuginfo" instead of a single "foo" repository. This not only means that YUM doesn't have to inspect the metadata for the source packages in normal operations but that it doesn't even need to download the metadata for them over the network. The YUM related tools, like yumdownloader --source and debuginfo-install, will enable the relevant repos. if you keep to the above naming convention.&lt;/p&gt;
&lt;/li&gt;

&lt;li&gt; &lt;h3&gt;Some exotic features are just painful:&lt;/h3&gt; &lt;p&gt;There are still some corner cases, for instance using the "includepkgs" configuration option still requires that YUM load all of the repo. data into memory, whereas using the "exclude" configuration option only requires inspecting the metadata for packages which match. This may get better over time, but is something to be aware of if performance is not what you'd hoped for.&lt;/p&gt;
&lt;/li&gt;

&lt;li&gt; &lt;h3&gt;Using a newer YUM in Fedora 7/8:&lt;/h3&gt; &lt;p&gt;If you want to try the latest YUM in Fedora 7 or 8, for the speed gains or for the extra features, you can do the following:&lt;/p&gt;

&lt;pre&gt;
% yum install pygpgme
% yum install --enablerepo=development yum yum-utils python-urlgrabber
&lt;/pre&gt;

&lt;p&gt;...you should still review what the second command above wants to do, just to make sure it doesn't want to install 100s of packages in the latter case; however I've used the above on multiple machines, on multiple architectures and on Fedora 7 and 8 ... without any problems.&lt;/p&gt;
&lt;/li&gt;

&lt;/ul&gt;
&lt;a name='cutid1-end'&gt;&lt;/a&gt;</content>
  </entry>
  <entry>
    <id>urn:lj:livejournal.com:atom1:illiterat:5043</id>
    <link rel="alternate" type="text/html" href="http://illiterat.livejournal.com/5043.html"/>
    <link rel="self" type="text/xml" href="http://illiterat.livejournal.com/data/atom/?itemid=5043"/>
    <title>YUM resource usage, an accurate assessment</title>
    <published>2008-05-21T06:58:33Z</published>
    <updated>2008-07-03T20:10:35Z</updated>
    <category term="performance"/>
    <category term="yum"/>
    <category term="benchmarks"/>
    <content type="html">&lt;p&gt; As a &lt;a href="http://linux.duke.edu/projects/yum/" rel="nofollow"&gt;YUM&lt;/a&gt; developer it's hard not to notice the number of complaints/attacks on YUM about it being slow and/or using too much memory; the most common thing about almost all of these complaints though is how inaccurate they are. To be clear, YUM did use to take a lot more time than was required to do simple operations and there are very likely some more improvements that can be made for &lt;a href="https://lists.dulug.duke.edu/pipermail/yum-devel/2008-May/005244.html" rel="nofollow"&gt;certain situations&lt;/a&gt; &lt;b&gt;BUT&lt;/b&gt; yum is currently "not slow" at most normal operations for Fedora users, and RHEL/CentOS users should be getting a much newer/faster variant than they currently have in the very near future.&lt;/p&gt;

&lt;a name="cutid1"&gt;&lt;/a&gt;
&lt;p&gt; The most common misleading piece of data is to directly compare YUM to &lt;a href="http://apt-rpm.org/" rel="nofollow"&gt;APT&lt;/a&gt;, &lt;a href="http://labix.org/smart/" rel="nofollow"&gt;smart&lt;/a&gt; or &lt;a href="http://en.opensuse.org/Zypper" rel="nofollow"&gt;Zypper&lt;/a&gt;. The most obvious problem is that the commands/interfaces do not directly map between applications, and so are hard to compare. For instance, at the simplest level, "yum install" accepts package names, "provides" or filenames contained within a package, all of which can have wildcards and/or some of epoch/version/release/arch. But the problem goes even deeper than that, as the core assumptions each application makes are very different and so are hard to compare. For instance, yum will automatically update its metadata for the configured repositories but smart and apt both require the user to do this manually. Or that yum/rpm is assumed to be used in an environment that has dependencies which use a combination of package names, versioned PRCO (Provides, Requires, Conflicts and Obsoletes) &lt;b&gt;and explicit file dependencies&lt;/b&gt;.&lt;/p&gt;

&lt;p&gt; In short it's fair to say that each package management tool has a close relationship with the main distribution(s) it is developed for, if for no other reason than that's where the developers tend to come from but also in more than one instance in the last 6 months specific cases of how new YUM features would work were defined by Fedora and not just the YUM developers. &lt;/p&gt;

&lt;p&gt; Which brings us to YUM resource usage within Fedora/etc. which, as I said initially, &lt;b&gt;in older versions&lt;/b&gt; has been a real and valid complaint against YUM. However the situation has been &lt;a href="http://people.redhat.com/jantill/yum/benchmarks" rel="nofollow"&gt;much improved&lt;/a&gt; over the last year. The set of use cases for which current YUM can still be "significantly" slower than we'd like is vanishingly small. Obviously Fedora 7 and 8, as well as RHEL/CentOS 5, will be slower to pick up the speed enhancements that have been made since 3.2.8; however this is not a failing of YUM, more a consequence of the longer release periods associated with those distributions. &lt;/p&gt;
&lt;a name='cutid1-end'&gt;&lt;/a&gt;

&lt;p&gt; As a final note I'd say that the biggest reason a lot of these recent complaints annoy me is not just that they are inaccurate but that they tend to grossly mislead the reader into thinking that "package management" is a solved problem and the biggest obstacle to solve is whether "install foo" takes 3 seconds or 6 seconds. In my opinion this could not be further from the truth, managing 2-6 machines is an ugly problem in all the current solutions and managing 100-10,000 is basically not done. Then there are things like &lt;a href="http://illiterat.livejournal.com/3568.html" rel="nofollow"&gt;the 10x10 problem&lt;/a&gt;, where we are only starting to see ideas like &lt;a href="http://fedoraproject.org/wiki/JesseKeating/KojiPersonalRepos" rel="nofollow"&gt;KOPERS&lt;/a&gt; which might help solve the problem (but will require changes in the way package management is assumed to work). And that's completely ignoring the problems that have had attempted solutions (that failed) multiple times, like "rollback" support.
&lt;/p&gt;</content>
  </entry>
  <entry>
    <id>urn:lj:livejournal.com:atom1:illiterat:4615</id>
    <link rel="alternate" type="text/html" href="http://illiterat.livejournal.com/4615.html"/>
    <link rel="self" type="text/xml" href="http://illiterat.livejournal.com/data/atom/?itemid=4615"/>
    <title>Python uses too much memory?</title>
    <published>2008-05-21T05:25:48Z</published>
    <updated>2008-07-03T20:10:09Z</updated>
    <category term="performance"/>
    <category term="python"/>
    <category term="yum"/>
    <category term="fedora"/>
    <content type="html">&lt;p&gt; There's a common attack leveled against &lt;a href="http://www.python.org/" rel="nofollow"&gt;Python&lt;/a&gt; applications that they take up too much memory, by people who understand the language difference (against, say, C) and by people just looking at their process list. This is often esp. evident on the newer x86_64 computers.
&lt;/p&gt;

&lt;p&gt; So as the Fedora Python maintainer, and a yum developer (an application written mostly using python), I figured it was probably worth investigating what the difference really was.
&lt;/p&gt;

&lt;a name="cutid1"&gt;&lt;/a&gt;

&lt;p&gt; First I wrote a simple program which just created new
yum.YumBase() objects and appended them to a list (numbers got from
parsing /proc/self/status) which gave the following results:
&lt;/p&gt;

&lt;pre&gt;
.x86_64     0 peak 219.90MB size 219.90MB rss  13.30MB
.x86_64     1 peak 219.90MB size 219.90MB rss  13.33MB
.x86_64 90001 peak 610.46MB size 610.46MB rss 403.75MB

.i386       0 peak  20.65MB size  20.65MB rss   9.61MB
.i386       1 peak  20.65MB size  20.65MB rss   9.63MB
.i386   90001 peak 212.77MB size 212.77MB rss 201.82MB
&lt;/pre&gt;

&lt;p&gt;
...which seems pretty damning of python on .x86_64 and/or yum, 2x for RSS and much more for VSZ (10x to start with above, which is obviously &lt;b&gt;a lot&lt;/b&gt;). So then I added a "pmap" call right at the end, to find out where that allocated memory was going, the most interesting pieces of data being:&lt;/p&gt;

&lt;pre&gt;
0000000000601000 449696K rw---    [ anon ]
[...]
00002aaaaab5a000  76136K r----  /usr/lib/locale/locale-archive
[...]
00002aaaafa8d000     20K r-x--  /usr/lib64/python2.5/lib-dynload/stropmodule.so
00002aaaafa92000   2044K -----  /usr/lib64/python2.5/lib-dynload/stropmodule.so
00002aaaafc91000      8K rw---  /usr/lib64/python2.5/lib-dynload/stropmodule.so
&lt;/pre&gt;

&lt;p&gt;...on .x86_64, taking a single shared object as an example, versus on .i386:&lt;/p&gt;

&lt;pre&gt;
00c58000             16K r-x--  /usr/lib/python2.5/lib-dynload/stropmodule.so
00c5c000              8K rwx--  /usr/lib/python2.5/lib-dynload/stropmodule.so
[...]
09290000         222296K rwx--    [ anon ]
[...]
b7d23000           2048K r----  /usr/lib/locale/locale-archive
&lt;/pre&gt;

&lt;p&gt;...as you can see the shared library has a &lt;b&gt;2MB hole&lt;/b&gt; in the middle of it, which is counted towards its VSZ even though it is not writable, executable or readable (and so I'd assume is &lt;b&gt;not using any real memory&lt;/b&gt;). This basically means that VSZ is worthless on .x86_64, and is just even more worthless for python programs because they tend to load more shared objects.&lt;/p&gt;
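
&lt;p&gt; As a rough illustration of the arithmetic above, here is a minimal Python sketch (not the script used for this entry) that sums pmap-style lines and reports how much of the total has no permissions at all. The helper name sum_mappings and the sample constant are made up for the example, and it assumes every size is printed with a K suffix as in the output above.&lt;/p&gt;

&lt;pre&gt;
```python
# Hypothetical helper: sum pmap mapping sizes, keeping a separate count
# for mappings with no permissions at all ("-----").
SAMPLE_X86_64 = """\
00002aaaafa8d000     20K r-x--  /usr/lib64/python2.5/lib-dynload/stropmodule.so
00002aaaafa92000   2044K -----  /usr/lib64/python2.5/lib-dynload/stropmodule.so
00002aaaafc91000      8K rw---  /usr/lib64/python2.5/lib-dynload/stropmodule.so
"""


def sum_mappings(pmap_text):
    """Return (total_kb, inaccessible_kb) for pmap-style lines."""
    total_kb = 0
    inaccessible_kb = 0
    for line in pmap_text.splitlines():
        fields = line.split()
        # Expected fields: address, size ending in "K", permissions, name.
        if len(fields) in (0, 1, 2) or not fields[1].endswith("K"):
            continue  # skip "[...]" markers and anything malformed
        size_kb = int(fields[1][:-1])
        total_kb += size_kb
        if fields[2] == "-----":
            inaccessible_kb += size_kb
    return total_kb, inaccessible_kb


if __name__ == "__main__":
    total, hole = sum_mappings(SAMPLE_X86_64)
    print("total %dK, inaccessible %dK" % (total, hole))
```
&lt;/pre&gt;

&lt;p&gt; For the three stropmodule.so lines from .x86_64 nearly all of the size is the inaccessible hole, while the two .i386 lines contain no such mapping.&lt;/p&gt;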

&lt;p&gt; The locale archive being &lt;b&gt;roughly 37 times bigger&lt;/b&gt; is explained by these lines from glibc/locale/loadarchive.c:&lt;/p&gt;
&lt;pre&gt;
      /* Map an initial window probably large enough to cover the header
         and the first locale's data.  With a large address space, we can
         just map the whole file and be sure everything is covered.  */

      mapsize = (sizeof (void *) &amp;gt; 4 ? archive_stat.st_size
                 : MIN (archive_stat.st_size, ARCHIVE_MAPPING_WINDOW));

      result = __mmap64 (NULL, mapsize, PROT_READ, MAP_FILE|MAP_COPY, fd, 0);
&lt;/pre&gt;

&lt;p&gt;...which means any program that uses the C locale functions gets an extra ~73MB of VSZ at
startup on .x86_64.&lt;/p&gt;
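
&lt;p&gt; The quoted expression can be restated as a small Python sketch, purely for illustration. The 2MB value for ARCHIVE_MAPPING_WINDOW is an assumption that matches the 2048K mapping in the .i386 pmap output, initial_mapsize is a hypothetical name and not a glibc function, and the example assumes the archive file is the same size on both architectures.&lt;/p&gt;

&lt;pre&gt;
```python
# Assumed value: matches the 2048K locale-archive mapping seen on .i386.
ARCHIVE_MAPPING_WINDOW = 2 * 1024 * 1024


def initial_mapsize(pointer_size, archive_size):
    """Restate the quoted glibc expression for the first archive mapping.

    pointer_size stands in for sizeof (void *), archive_size for st_size.
    """
    if pointer_size in range(1, 5):
        # Pointers of at most 4 bytes: map only a window of the file.
        return min(archive_size, ARCHIVE_MAPPING_WINDOW)
    # Large address space: map the whole file.
    return archive_size


if __name__ == "__main__":
    # Size of the .x86_64 mapping reported by pmap; assumed to be the
    # size of the archive file on both architectures.
    archive_size = 76136 * 1024
    for arch, pointer_size in ((".x86_64", 8), (".i386", 4)):
        mapped_kb = initial_mapsize(pointer_size, archive_size) // 1024
        print("%-8s maps %6dK of locale-archive" % (arch, mapped_kb))
```
&lt;/pre&gt;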

&lt;p&gt; The next interesting part of the data from pmap is that there are roughly 24 "anonymous" mappings
for .x86_64 and only 20 for .i386. A little investigation shows that glibc is again the reason, as the default value for M_MMAP_THRESHOLD (basically the size at which glibc creates new mappings for data, instead of reusing old ones) doesn't expand with size_t/time_t/etc. (which are twice as big). You can see this by setting MALLOC_MMAP_MAX_=0 in the environment before running your application, which will produce the same number of "anonymous" mappings on x86_64.&lt;/p&gt;

&lt;p&gt; And after taking into account all of the above, which is completely the domain of glibc and not python, the memory numbers add up as "simple doubling" as you go from 4 byte size_t/time_t/intptr_t/etc. to 8 bytes for the same.&lt;/p&gt;
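
&lt;p&gt; For anyone wanting to repeat the measurements from the start of this entry, the peak/size/rss columns can be read from /proc/self/status along these lines. This is a sketch rather than the original test program: the function names are invented here, the loop that appends yum.YumBase() objects is left out, and it assumes the kernel reports VmPeak, VmSize and VmRSS in kB.&lt;/p&gt;

&lt;pre&gt;
```python
import os


def parse_status(status_text):
    """Return a dict with "peak", "size" and "rss" in kB, where present."""
    wanted = {"VmPeak": "peak", "VmSize": "size", "VmRSS": "rss"}
    found = {}
    for line in status_text.splitlines():
        key, sep, rest = line.partition(":")
        if sep and key in wanted:
            # Lines look like "VmRSS:     13619 kB".
            found[wanted[key]] = int(rest.split()[0])
    return found


def format_row(arch, count, usage):
    """Format one result row in the same shape as the tables above."""
    return "%s %5d peak %6.2fMB size %6.2fMB rss %6.2fMB" % (
        arch, count,
        usage.get("peak", 0) / 1024.0,
        usage.get("size", 0) / 1024.0,
        usage.get("rss", 0) / 1024.0)


if __name__ == "__main__":
    # Only meaningful on systems that provide /proc/self/status.
    if os.path.exists("/proc/self/status"):
        with open("/proc/self/status") as status_file:
            print(format_row(".local", 0, parse_status(status_file.read())))
```
&lt;/pre&gt;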

&lt;a name='cutid1-end'&gt;&lt;/a&gt;</content>
  </entry>
  <entry>
    <id>urn:lj:livejournal.com:atom1:illiterat:4503</id>
    <link rel="alternate" type="text/html" href="http://illiterat.livejournal.com/4503.html"/>
    <link rel="self" type="text/xml" href="http://illiterat.livejournal.com/data/atom/?itemid=4503"/>
    <title>I called it, Obama for democratic nominee/president</title>
    <published>2008-01-08T00:26:18Z</published>
    <updated>2008-01-08T00:26:58Z</updated>
    <content type="html">Feel free to pick me up and jiggle me about before asking other questions obviously answered by crappy human nature:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://extempore.livejournal.com/174568.html?thread=4078824#t4078824" rel="nofollow"&gt;Obama to win over Hillary, March 1st, 2007 05:07 pm&lt;/a&gt;</content>
  </entry>
  <entry>
    <id>urn:lj:livejournal.com:atom1:illiterat:4324</id>
    <link rel="alternate" type="text/html" href="http://illiterat.livejournal.com/4324.html"/>
    <link rel="self" type="text/xml" href="http://illiterat.livejournal.com/data/atom/?itemid=4324"/>
    <title>The myth of the "simple String API", that isn't really needed</title>
    <published>2007-10-04T05:12:55Z</published>
    <updated>2007-10-04T05:12:55Z</updated>
    <content type="html">&lt;p&gt; I saw three things recently which just confirmed that no one learns anything:
&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The "we don't need a real string API" on the &lt;a href="http://thread.gmane.org/gmane.comp.version-control.git/57643/focus=58021" rel="nofollow"&gt;GIT&lt;/a&gt; mailing list, followed by almost a month now of struggling to implement another one (and wishing they had some of the &lt;a href="http://www.and.org/ustr/" rel="nofollow"&gt;ustr&lt;/a&gt; features).&lt;/li&gt;
&lt;li&gt;GLibc's &lt;a href="http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1248.pdf" rel="nofollow"&gt;proposed fix&lt;/a&gt; for C/POSIX not having a usable string API. Yes, it's the old let's all pretend IO (FILE *) is a good string abstraction ... you need to add, search/etc, and then add some more? Well multiple copies were always good fun, pretty much guaranteeing people will do a bunch of workarounds if they use it at all. &lt;/li&gt;
&lt;li&gt;The Linux kernel is again looking for &lt;a href="http://lwn.net/Articles/251650/" rel="nofollow"&gt;yet another almost a worthwhile string API&lt;/a&gt;, pretending it'll only be useful for this specific IO type ... because seq_printf() wasn't unique enough.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt; Best quote by far, IMO (from the GIT thread):&lt;/p&gt;

&lt;blockquote&gt;
 "Please, no.  Let's not pull in a dependency for something as simple as a
string library." -- Kristian Høgsberg
&lt;/blockquote&gt;

&lt;p&gt; As a minor good note, ustr is in Fedora and so pretty much everyone in Fedora land has access to a usable string API with a single -l ... and the newer versions of libsemanage (SELinux) use ustr. I'm not going to hold my breath for mass sanity to occur though.
&lt;/p&gt;</content>
  </entry>
  <entry>
    <id>urn:lj:livejournal.com:atom1:illiterat:3568</id>
    <link rel="alternate" type="text/html" href="http://illiterat.livejournal.com/3568.html"/>
    <link rel="self" type="text/xml" href="http://illiterat.livejournal.com/data/atom/?itemid=3568"/>
    <title>Software packaging, and the 10 × 10 problem</title>
    <published>2007-06-20T06:16:59Z</published>
    <updated>2007-06-20T06:16:59Z</updated>
    <content type="html">&lt;p&gt; First off, I hate all current software packaging for Linux. It's one step up from manually downloading tarballs, which I was doing 10 years ago, and isn't even a real superset of that functionality. Yes, it's better than doing that and yes it's better than what Windows has to offer. But it's still crap, and it annoys me every day. However it seems to be one of those problems that no one seems to want to fix properly, and yet they keep thinking up stupid bandaids to work around the fact it doesn't work well. Specifically the major circular argument that is what I've come to think of as the "10 × 10 problem"…
&lt;/p&gt;

&lt;a name="cutid1"&gt;&lt;/a&gt;
&lt;p&gt; Repeatedly over the last 5+ years I've tried to tell people about how we should be solving this problem, and I've mostly got dismissed … so here it is for posterity.
&lt;/p&gt;

&lt;p&gt; One of the current major problems with software packaging and distribution is updating it. All software has bugs, or missing features, and those affect different people in different ways. Say you have two people, one who is writing a new web application in python using postgresql and one who is just browsing the web, sharing some files via the web server and running his own blog. It probably seems obvious to most sane people that the updates these two people want will vary wildly for the five packages: firefox, python, ipython, postgresql, apache-httpd. That is, unless you are in the business of distributing software packages.
&lt;/p&gt;

&lt;p&gt; With software packaging/release/whatever the argument is most often about who gets screwed over least. In the above example a common "solution" is to do very minor updates for firefox and maybe ipython but nothing else … want firefox-2, or the latest postgresql/python? Too bad. Wait for the next "major" release and get those with everything else. A common painful package is
the Linux kernel, mainly just because everyone uses it, and so &lt;a href="http://www.kroah.com/log/linux/enterprise_kernel_future.html" rel="nofollow"&gt;all the software packagers argue about how often it should/shouldn't be updated&lt;/a&gt;. This is roughly the same as arguing what single human language would be best for the entire OS, if everyone should use python or C to develop all applications or what font everyone should use for all text. There is &lt;b&gt;no correct answer&lt;/b&gt;, because the entire question is completely insane.
&lt;/p&gt;

&lt;p&gt; When I first thought about this, I was visiting/speaking with quite a few different Linux customers and I came up with the general idea of the "10 × 10 problem". The general statement of the problem is that you have 10 customers, each of which has 10 different "changes" they want for the next release. The current "solution" to the problem involves picking a number between 0 and 100, doing that amount of changes and giving the result to everyone so that you have a very small number of "releases" which can be managed by hand. This rarely makes anyone happy, and is basically what Microsoft, Sun and SCO have done for the last 10 years.
&lt;/p&gt;

&lt;p&gt; Obviously in the real world things are more complicated than the above. One of the most common differences is that each of the 10 customers has changes that are important enough they don't want to wait for "the next release", and they'll also want some of the changes you've done for the other people but also have desires like "apart from those changes, I don't want anything else to change". This also destroys any hope of there ever being a magic number of changes you can do, that will actually make everyone happy. And yet, still we get arguments about what the best magic number is.
&lt;/p&gt;

&lt;p&gt; The "solution" to this problem is to admit defeat and stop managing software packaging and distribution like it's 1995. Imagine for a moment that each "customer" had their own private software packaging and distribution team and used that instead of paying someone outside to do the work for them. In this scenario it's very likely that all the high priority changes would be done immediately, and that all 10 changes would make it into "their" next release with no other changes. Now imagine that all those private teams communicated openly with each other: this would result in a similar outcome with the added benefit that significant changes done by one team that are desired by others would be merged into their releases.
&lt;/p&gt;

&lt;p&gt; The common complaint about why this "can't be done" by an external entity is that managing so many releases by hand is "hard" and would cost too much money. In my so humble opinion this is a bit like saying that managing more than one version of a C source file is hard and time consuming, as against implementing some decent SCM tools so that you can manage 100s of 1,000s of different version sets. The next common complaint is that the entire OS needs to be tested as one unit, which is mostly untrue and esp. so when you are talking about changes the average customer desires … and again, any truth in this is mostly a lack of decent tools.
&lt;/p&gt;

&lt;p&gt; The reason this could never have happened over the last 20 years is that the code itself was hidden from the customers, so they had no choice except to take what they were given by their software packaging and distribution company. This is not the case now, and I've already started to see the emergence of people implementing the good solution privately because they can't buy it from anyone. I'm just waiting for someone to give people what they want, and charge for it ... and I hope I work for the company that does it.
&lt;/p&gt;

&lt;p&gt; As a final optimistic note, I have helped the real solution come a tiny bit closer to reality. With Fedora 7 you can now install the yum-security package, and do things like "yum --security update -y" to get just security updates or "yum --bz 1234 update -y" to get just the updates which fix bugzilla.redhat.com#1234.
&lt;/p&gt;

&lt;a name='cutid1-end'&gt;&lt;/a&gt;</content>
  </entry>
  <entry>
    <id>urn:lj:livejournal.com:atom1:illiterat:3272</id>
    <link rel="alternate" type="text/html" href="http://illiterat.livejournal.com/3272.html"/>
    <link rel="self" type="text/xml" href="http://illiterat.livejournal.com/data/atom/?itemid=3272"/>
    <title>Fedora Kernel update causes: mkrootdev expected fs options</title>
    <published>2007-04-24T00:19:06Z</published>
    <updated>2007-04-24T00:20:19Z</updated>
    <content type="html">&lt;p&gt; I'm mainly dropping this somewhere so other people can find it, I recently did a kernel update and when I rebooted I got:&lt;/p&gt;
&lt;pre&gt;
mkrootdev expected fs options
mount: missing mount point
setuproot: moving /dev/failed no such file or directory
setuproot: error mounting /proc
setuproot: error mounting /sys
switchroot: mount failed No such file or directory
kernel panic
&lt;/pre&gt;

&lt;p&gt;...it turns out this was because I'd somehow got two / entries in /etc/fstab, as I found from &lt;a href="http://www.fedoraforum.org/forum/archive/index.php/t-137872.html" rel="nofollow"&gt;this posting&lt;/a&gt;. The only thing you need to be aware of is that you also have a copy of /etc/fstab in the initrd, so after fixing /etc/fstab you'll want to re-run the postinstall scriptlet (rpm -q --scripts kernel), Ie. /sbin/new-kernel-pkg --package kernel --mkinitrd --depmod --install &amp;lt;kernel version&amp;gt;
&lt;/p&gt;</content>
  </entry>
  <entry>
    <id>urn:lj:livejournal.com:atom1:illiterat:2989</id>
    <link rel="alternate" type="text/html" href="http://illiterat.livejournal.com/2989.html"/>
    <link rel="self" type="text/xml" href="http://illiterat.livejournal.com/data/atom/?itemid=2989"/>
    <title>lighttpd's "AIO sendfile"</title>
    <published>2006-11-14T22:01:22Z</published>
    <updated>2006-11-14T22:03:28Z</updated>
    <content type="html">&lt;p&gt; I keep track of a couple of the other web servers out there, even though I don't think much of them ... for various reasons. But when I saw that &lt;a href="http://blog.lighttpd.net/articles/2006/11/14/pre-release-lighttpd-1-5-0-r1435-tar-gz" rel="nofollow"&gt;lighttpd was advertising AIO sendfile&lt;/a&gt; for their 1.5.x I was pretty impressed, while I hadn't been following it closely I didn't think the &lt;a href="http://marc.theaimsgroup.com/?l=linux-netdev&amp;amp;m=114649980329781&amp;amp;w=2" rel="nofollow"&gt;Linux AIO sendfile code&lt;/a&gt; had gone in yet (I'd been planning on trying to use the splice() API with IO helpers to do the same thing ... but I keep not getting around to it).
&lt;/p&gt;
&lt;p&gt; Alas, looking deeper into it, &lt;a href="http://blog.lighttpd.net/articles/2006/11/12/lighty-1-5-0-and-linux-aio" rel="nofollow"&gt;the release note is very misleading&lt;/a&gt;. They are actually doing AIO reads into allocated memory, and then using non-AIO sendfile() on the fd associated with the memory. Of course, this means:
&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt; It copies the entire file into memory (in sections of half a MB). &lt;/li&gt;
&lt;li&gt; It requires copying an entire file section before anything is sent. &lt;/li&gt;
&lt;li&gt; There is no sharing! So if you have 10,000 connections requesting the same 2MB file you get roughly 5GB of memory used (10,000 connections each holding their own half-MB section). &lt;/li&gt;
&lt;li&gt; The real page cache readahead is going to have to be that much more intelligent to do the right thing. &lt;/li&gt;
&lt;li&gt; There is no guarantee that the memory isn't swapped! So you can still have all the blocking artifacts of normal non-AIO sendfile(). &lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;...basically this is a lot of work to reimplement the page cache, badly. &lt;/p&gt;

&lt;p&gt; I also seriously question whether the same "improvement" could have been got by just over allocating on the process to CPU mapping (something that I understand lighttpd can do as well as and-httpd). It also implies that posix_fadvise() should be doing more work (it is supposed to solve this exact problem), but maybe sprinkling explicit readahead() calls would also work around the problem (without screwing everyone else on the box). &lt;/p&gt;

&lt;p&gt; Also why lighttpd uses sendfile() instead of just write() for the last piece, I'm not sure ... it's like doing a mmap() and then calling sendfile() on the mmap'd fd.
&lt;/p&gt;</content>
  </entry>
  <entry>
    <id>urn:lj:livejournal.com:atom1:illiterat:2753</id>
    <link rel="alternate" type="text/html" href="http://illiterat.livejournal.com/2753.html"/>
    <link rel="self" type="text/xml" href="http://illiterat.livejournal.com/data/atom/?itemid=2753"/>
    <title>Re(tar)ddit.com, newegg and others ignoring HTTP/HTML</title>
    <published>2006-10-21T05:24:14Z</published>
    <updated>2006-10-21T05:39:35Z</updated>
    <content type="html">&lt;p&gt;
 So I tried &lt;a href="http://reddit.com/" rel="nofollow"&gt;reddit.com&lt;/a&gt; about a year ago, mainly due to the whole &lt;a href="http://ycombinator.com/about.html" rel="nofollow"&gt;Y combinator&lt;/a&gt; thing. It seemed like it might be cute, but at the time didn't seem to contain any decent content on the front page &lt;b&gt;and it required JavaScript&lt;/b&gt; to do anything interactive. So, I figured I'd come back a bit later when they've had a chance to polish it and take another look.
&lt;/p&gt;

&lt;p&gt; As you might guess, it's &lt;b&gt;still JS only&lt;/b&gt; and they almost seem proud of it. Like they get more Web-2.0 juice or something. Hello 1995 called to let you know links are possible without JS. They even have "&lt;a href="http://reddit.com/buttons" rel="nofollow"&gt;buttons&lt;/a&gt;" that you can attach to your site, so people can vote up/down your content easily ... but the "button" is just a script tag. Yeh, so not only do I apparently want to leave all my non-JS enabled agents out in the cold ... I want to actively mix-in JS from a third party? It's like the blind being led by the retards.
&lt;/p&gt;

&lt;p&gt; I could almost understand, the "JS is almost everywhere" so we'll just ignore everyone who wants security/stability/speed kind of argument for reddit.com ... if the rest of their usage of HTTP/HTML didn't seem like it was written by a 13yr old. For instance &lt;a href="http://reddit.com/static/reddit.js" rel="nofollow"&gt;http://reddit.com/static/reddit.js&lt;/a&gt; is their "main" blob of imported JS functions. Now, given it says static in the URL, you might imagine that this library code would be heavily optimized with all the latest HTTP bits possible (in fact I was surprised there wasn't a version/date in the URL). But no, there's no Expires/Cache-Control headers. At least it has an ETag and does Content-Encoding (although lighttpd is broken and ignores it for HEAD requests).
&lt;/p&gt;

&lt;p&gt; But, reddit.com is still somewhat easy to dismiss as crack smokin' 13yr olds who still need a few years to grow up. After all it's not a real business, and probably doesn't make worthwhile amounts of money. But, then, newegg.com does ... and there almost everything works without JS, the notable exception being actually doing the sale. Sure, let me look at stuff and even do searches with a secure browser ... but when I need to type in my credit card, that's the time to rely on JS to do 1998 style forms.&lt;/p&gt;

&lt;p&gt; Unsurprisingly &lt;a href="http://amazon.com/" rel="nofollow"&gt;amazon.com&lt;/a&gt;, as the largest online retailer, have consistently had the best UI for non-JS user-agents. Even tags, their latest tweak, started off as JS only but then moved to working perfectly without JS within a couple of weeks. But, I guess maybe it's got less 2.0 juice now.
&lt;/p&gt;</content>
  </entry>
  <entry>
    <id>urn:lj:livejournal.com:atom1:illiterat:2417</id>
    <link rel="alternate" type="text/html" href="http://illiterat.livejournal.com/2417.html"/>
    <link rel="self" type="text/xml" href="http://illiterat.livejournal.com/data/atom/?itemid=2417"/>
    <title>HTTP for desktop applications</title>
    <published>2006-02-07T23:41:11Z</published>
    <updated>2006-02-07T23:43:23Z</updated>
    <content type="html">&lt;p&gt;So to continue the HTTP theme, Miguel's idea that
&lt;a href="http://tirania.org/blog/archive/2005/Nov-26-2.html" rel="nofollow"&gt;desktop applications communicate via HTTP&lt;/a&gt; seems completely insane. I can only presume Miguel was not involved at all in the Mono HTTP API implementations. As part of writing my webserver, I've already written about &lt;a href="http://www.and.org/texts/server-http" rel="nofollow"&gt;how terrible the HTTP spec. is&lt;/a&gt; ... hell apache-httpd blatantly ignores significant parts of it.&lt;/p&gt;
&lt;a name="cutid1"&gt;&lt;/a&gt;
&lt;p&gt; It's not like there's even a hope that by "HTTP" Miguel meant "METHOD &amp;lt;path&amp;gt; &amp;lt;version&amp;gt;\r\nHeaders: value\r\n\r\n" ... much like people often do when they talk about "XML" but really mean "something in readable text delineated by angle brackets". Because he wants to be able to "reuse" existing implementations. So, here are my explanations of the list of "benefits" to using HTTP:&lt;/p&gt;

&lt;dl&gt;
&lt;dt&gt;HTTP is a well known protocol&lt;/dt&gt;
&lt;dd&gt;HTTP is a well mis-understood protocol, it has a &lt;b&gt;huge&lt;/b&gt; array of parsing pitfalls. And that doesn't even include the number of things people screw up, that you should work around if you want to be nice. Basically all HTTPDs have had security bugs in their HTTP parsers, and-httpd is a big &lt;a href="http://www.and.org/and-httpd/#security-guarantee" rel="nofollow"&gt;exception&lt;/a&gt; here ... and even then I'd never suggest I have implemented HTTP bug free.&lt;/dd&gt;

&lt;br /&gt;

&lt;dt&gt;There are plenty of HTTP client and server implementations that can be employed&lt;/dt&gt;
&lt;dd&gt;There are plenty of broken client/server libraries, and there isn't even a well defined way to communicate "here's a list of stuff" without resorting to complete insanity like XMLRPC or WebDAV. Also hooking anything into more than one server basically means writing CGI (or, possibly FastCGI) or doing multiple implementations.&lt;/dd&gt;

&lt;br /&gt;

&lt;dt&gt;The protocol can be as simple or as complex as needed by the applications. From passing all the arguments on a REST header as arguments to the tool-aware SOAP&lt;/dt&gt;
&lt;dd&gt;Riiight. Show me a single HTTP client that does SOAP requests which degrade to REST. You basically have to pick one, so you go from clients needing TCP+MYPROTO to TCP+HTTP+REST+MYPROTO ... and this is better? ... for something that 99.999% of the time is going to be machine local?&lt;/dd&gt;

&lt;br /&gt;

&lt;dt&gt;HTTP works well with large volumes of data and large numbers of clients&lt;/dt&gt;
&lt;dd&gt;hahahaha. Apache dies easily with small numbers of clients, and only got large file support (LFS) in 2.2.0.&lt;/dd&gt;

&lt;br /&gt;

&lt;dt&gt;Scaling HTTP is a well understood problem&lt;/dt&gt;
&lt;dd&gt;Maybe ... but it's not &lt;b&gt;easy&lt;/b&gt;. And the scaling currently is all to do with "browser HTTP", i.e. GET is basically the only method used ... if GNOME introduces random methods for different things in the desktop most of the current scaling knowledge goes --&amp;gt; that way.&lt;/dd&gt;

&lt;br /&gt;

&lt;dt&gt;Users can start with a REST API, but can easily move to SOAP if they need to&lt;/dt&gt;
&lt;dd&gt;This is a repeat of problem/benefit #3 ... again implying that having random crap at the end of the HTTP transport is any more useful than having an MS Word file containing a single element with base64 encoded data in it.&lt;/dd&gt;

&lt;br /&gt;

&lt;dt&gt;HTTP includes format negotiations: clients can request a number of different formats and the server can send the best possible match, it is already part of the protocol. This colors the request with a second dimension if they choose to.&lt;/dt&gt;
&lt;dd&gt;Sure, and how many HTTP servers/services currently implement this? I've not seen a CGI that does, lighttpd can't do it, apache-httpd has &lt;a href="http://httpd.apache.org/docs/2.2/content-negotiation.html" rel="nofollow"&gt;.var files&lt;/a&gt; (which are unmaintainable IMO) and "multiviews" for normal people (which you shouldn't enable as then performance becomes horrible). There's also the problem that most of the browsers don't implement it well either (i.e. firefox "prefers" XML and XHTML over HTML, even though it renders HTML better (and faster) ... hell, it doesn't even render link tags in XML) so it's not exactly been tested a lot in the field.&lt;/dd&gt;

&lt;br /&gt;

&lt;dt&gt;Servers can take advantage of HTTP frameworks to implement their functionality on the desktop&lt;/dt&gt;
&lt;dd&gt;I think this is a "free remote desktop" play, now with caching and goes through firewalls! Because that's the only reason people aren't running remote apps. on their home machine, at work.&lt;/dd&gt;

&lt;br /&gt;

&lt;dt&gt;It is not another dependency on an obscure Linux library&lt;/dt&gt;
&lt;dd&gt;Yeh, I'm sure most people know what libcurl or libneon are ... and anything else that actually solves the problem is going to be shipped by everyone that cares. Plus you'll need something better than "just use HTTP and link with curl" ... which will be in its own "obscure Linux library".&lt;/dd&gt;

&lt;br /&gt;

&lt;dt&gt;The possibility of easily capturing, proxying and caching any requests&lt;/dt&gt;
&lt;dd&gt;Doing captures is a benefit ... and I can almost imagine designing a protocol over HTTP for that (of course nothing will decode the bit over HTTP ... but it gets you some way there). Proxying is just another battle in the users vs. IT dept. war ... and I can't see the users winning that one. Caching is laughable, most web apps. are terrible at caching ... and most of the simple "HTTP clients" like robots and Atom readers aren't implementing Accept-Encoding/If-None-Match/If-Modified-Since (in spite of god knows how many words on the subject).&lt;/dd&gt;

&lt;br /&gt;

&lt;dt&gt;Redirects could be used to redirect requests to remote hosts transparently and securely&lt;/dt&gt;
&lt;dd&gt;Maybe, although I don't see what is secure about it. And, again, most of the clients already have problems differentiating between 301, 302, 303 and 307 ... AFAICS firefox does the same thing for all of them.&lt;/dd&gt;
&lt;/dl&gt;</content>
  </entry>
  <entry>
    <id>urn:lj:livejournal.com:atom1:illiterat:2277</id>
    <link rel="alternate" type="text/html" href="http://illiterat.livejournal.com/2277.html"/>
    <link rel="self" type="text/xml" href="http://illiterat.livejournal.com/data/atom/?itemid=2277"/>
    <title>Design by committee</title>
    <published>2006-02-07T22:44:33Z</published>
    <updated>2006-02-07T22:45:57Z</updated>
    <content type="html">&lt;a href="http://nat.org/2006/february/#Dan-Winship-on-design-by-committee" rel="nofollow"&gt;So Nat says:&lt;/a&gt;
&lt;blockquote&gt; &lt;p&gt; Today &lt;a href="http://mail.gnome.org/archives/desktop-devel-list/2006-February/msg00115.html" rel="nofollow"&gt;Dan Winship&lt;/a&gt; wrote a wonderful mail about the perils of designing software by community process.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt; So, I have to wonder, do Nat/Dan/Miguel think that Apache-httpd is "bad"? My feeling is that it's a brilliant example of how committee designed software is terrible, hell the fact the config. parser lets each module parse its own syntax seems like letting each gedit plugin have its own GUI theme. And the way the module stuff is done reeks of "vision by committee", in that everyone just does their thing in their module so they don't have to speak to everyone else.
&lt;/p&gt;
&lt;p&gt; I also think this is somewhat interesting, as it also supports something I've believed for a long time ... i.e. design/security/quality isn't all it's espoused to be. It's nice, and everyone is happy to say so but very few are willing to pay for it (either with money, or even in time/work to move from something else). Compatibility, speed and dancing monkeys all play a greater role.&lt;/p&gt;&lt;p&gt; And to bring the argument back to the desktop, I even feel the same way ... I don't care how much quality design they've put into GNOME, it's all (and more) canceled out when they break focus follows mouse or infinite space on my panel buttons.
&lt;/p&gt;</content>
  </entry>
  <entry>
    <id>urn:lj:livejournal.com:atom1:illiterat:1986</id>
    <link rel="alternate" type="text/html" href="http://illiterat.livejournal.com/1986.html"/>
    <link rel="self" type="text/xml" href="http://illiterat.livejournal.com/data/atom/?itemid=1986"/>
    <title>And-httpd and security</title>
    <published>2005-09-22T17:21:55Z</published>
    <updated>2005-09-22T17:21:55Z</updated>
    <content type="html">&lt;p&gt;Well after using, what is now, And-httpd personally for many months, I've finally released an official version, seperate from Vstr. I've also backed it up with a &lt;a href="http://www.and.org/and-httpd#secure-guarantee" rel="nofollow"&gt;$500 "security guarantee"&lt;/a&gt;, and I'm not sure if that's stupid or "insightful", I guess time will tell. There are scarily huge amount of &lt;a href="http://freshmeat.net/browse/250/" rel="nofollow"&gt;web servers&lt;/a&gt; at freshmeat ... which makes me wonder why apache-httpd is still so popular, but I guess quality is different from quantity ... and most Linux distributions are reluctant to ship more than one of anything, and given you have to ship apache-httpd that kind of settles it. *sigh*.&lt;/p&gt;

&lt;p&gt;Anyway, after releasing promises of money in return for security bugs (which I'm assuming I won't have to pay, but then I think professional poker players assume they'll win too) I'm going for a long weekend away from a computer. So don't feel too despondent when I don't respond to messages that I'm stupid, and almost certainly $500 poorer :).&lt;/p&gt;
  </entry>
</feed>
