James Antill
Oct. 8th, 2009
I've now seen a few requests for yum which can be generalized as "work better with multiple versions of packages". So I decided to write this, instead of replying N times to all the different requests.
The simple feature: just do what rpm does
Since its inception rpm has differentiated between an "install" and an "update", allowing you to install X-1-1 and X-2-1 as long as neither package provides the same filename as the other with different content. At the rpm layer this is mostly an easy problem to solve: if the user wants to install instead of update, you just check all the files in both packages, and as long as there isn't a conflict you install the new package but don't remove the old one.
As I said, this mostly works well: you can update some packages and install/remove others. However, one problem you'll quickly run into is what happens when you have both X-1-1 and X-2-1 installed, and they now need to be updated by X-1-2 and X-2-2. There's no easy way to do this, as there's no good way to tell rpm "update just the X-1-* (or X-2-*) packages". This is a core problem for yum: if you have multiple versions of a package installed, what does "yum update" do?
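To make the "install instead of update" check concrete, here is a minimal sketch of the file comparison described above. It assumes each package is just a mapping of file path to a content digest; the function names and the sample data are made up for illustration, and real rpm looks at more than a digest when deciding whether two files conflict.

```python
def file_conflicts(old_files, new_files):
    """Return the paths both packages ship, but with different content."""
    return sorted(
        path
        for path, digest in new_files.items()
        if path in old_files and old_files[path] != digest
    )


def can_install_alongside(old_files, new_files):
    """True if the new package can be installed without removing the old one."""
    return not file_conflicts(old_files, new_files)


# Hypothetical file lists: path -> content digest.
x_1_1 = {"/usr/lib/X-1/libx.so": "aaa", "/usr/share/doc/X/README": "d0c"}
x_2_1 = {"/usr/lib/X-2/libx.so": "bbb", "/usr/share/doc/X/README": "d0c"}
x_2_1_bad = {"/usr/lib/X-2/libx.so": "bbb", "/usr/share/doc/X/README": "new"}
```

A shared path with identical content (the README in x_2_1) is fine; the same path with different content (x_2_1_bad) is what blocks the side-by-side install.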
Kernel: The special case
The kernel is treated differently from every other package, mainly because you can't move to the new version without a reboot, the running version needs to be able to access its modules, and if you only have one version and it breaks, the machine doesn't boot anymore. So yum has infrastructure set up to install multiple versions of packages, which is used/tested just by the kernel (although you can give that ability to any package).
However, the kernel has a few features in its favour. The two main ones are that each kernel package has files that are 100% unique to it, and that no other package "uses" the data within the kernel package (so nothing needs to answer the question "which of the N versions available should I use").
This works fine for the kernel, and has an obvious answer to the question "what does yum update do" ... if there is a newer version of the package than the one installed, we install it, and if the installonly_limit has been reached for this package we remove the oldest one (that isn't the running kernel, and hasn't been manually marked to be kept in the yumdb).
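The policy in the previous paragraph can be sketched roughly as follows. This is not yum's actual code: versions are simplified to tuples of integers (real rpm version comparison is more involved), and the function and parameter names are hypothetical.

```python
def installonly_update(installed, available, limit, running, keep=frozenset()):
    """Sketch of the install-only update policy.

    installed -- versions currently installed (tuples of ints, an assumption)
    available -- the newest version offered by the repositories
    limit     -- stand-in for installonly_limit
    running   -- version of the running kernel, never removed
    keep      -- versions manually marked to be kept

    Returns (new_installed, removed).
    """
    installed = sorted(installed)
    if not installed or available <= installed[-1]:
        return installed, []  # nothing newer to install

    installed.append(available)
    removed = []
    # Walk from oldest to newest, dropping unprotected versions
    # until we are back within the limit.
    for version in list(installed):
        if len(installed) <= limit:
            break
        if version == running or version in keep or version == available:
            continue
        installed.remove(version)
        removed.append(version)
    return installed, removed


kernels = [(2, 6, 30), (2, 6, 31), (2, 6, 32)]
result = installonly_update(kernels, (2, 6, 33), limit=3, running=(2, 6, 30))
```

Here the oldest kernel is the running one, so it is skipped and the next oldest is removed instead.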
This simple policy wouldn't work for any other packages though (like the example above), so we either need to change this policy (and make sure the kernel package keeps working) or try something else...
Embed the version in the package name
This is the next obvious solution, and is used quite frequently. You still need to make sure neither package has files which conflict with the other, and there are usability problems to do with the fact the package X is now called X1 and/or X2. But this, at least, can work and allows you to have X1-1-1 be upgraded by X1-1-2 and X2-2-1 be upgraded by X2-2-2.
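A small sketch of why the rename helps, assuming a deliberately naive update rule of "for each installed package, pick the newest available package with the same name". Packages are (name, version, release) tuples with integer versions; the function name and the data are hypothetical and this is not how yum actually resolves updates.

```python
def newest_update_per_name(installed, available):
    """Map each installed package to the newest same-named package, if any is newer."""
    updates = {}
    for name, version, release in installed:
        candidates = [
            pkg
            for pkg in available
            if pkg[0] == name and pkg[1:] > (version, release)
        ]
        if candidates:
            updates[(name, version, release)] = max(candidates, key=lambda p: p[1:])
    return updates


# One name, two installed versions: both get pulled to X-2-2.
same_name = newest_update_per_name(
    installed=[("X", 1, 1), ("X", 2, 1)],
    available=[("X", 1, 2), ("X", 2, 2)],
)

# Version embedded in the name: each stream updates independently.
versioned_names = newest_update_per_name(
    installed=[("X1", 1, 1), ("X2", 2, 1)],
    available=[("X1", 1, 2), ("X2", 2, 2)],
)
```

With a single name, X-1-2 is never chosen for anything; with X1 and X2 the name itself carries the "which stream" information that rpm otherwise has no way to express.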
The problems with doing this usually stem from the fact that the packages are not isolated, and they will often provide some of the same things (so even though their names are different, they can be referred to by a common provide name). This is esp. a problem with rpm autoprovides. It then becomes problematic to test, because instead of one test case (install package X) to see if it works, you now have three test cases (install X1, install X2, and install X1 and X2). This testing complexity rises as you add more versioned packages: with X3 it is 7 test cases, with X4 it is 15!
The good thing about this method (from a yum developer's point of view) is that nothing needs to be done on the package manager side. Each package is completely independent, so if normal packages install then versioned packages install.
Taking a look at an example: python2 and python3 packages
For a concrete example consider having python2 and python3 (as of 2009 that is desirable because of the backwards incompatibility snafu with python3). We can "easily" create a python3 package that is installable alongside the current default of python (== python-2.6/python-2.7/whatever). There are some auto-provides, but we could either remove/change them on the python3 package or make sure nothing else is using them. Users can install both and run either /usr/bin/python or /usr/bin/python3, and all seems good.
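The 3, 7, 15 progression is just the number of non-empty subsets of the versioned packages, i.e. 2^n - 1 for n packages. A quick sketch that enumerates them (the function name is made up for illustration):

```python
from itertools import combinations


def install_test_cases(packages):
    """Every non-empty combination of packages that could be installed together."""
    return [
        combo
        for size in range(1, len(packages) + 1)
        for combo in combinations(packages, size)
    ]


two = install_test_cases(["X1", "X2"])
counts = [
    len(install_test_cases([f"X{i}" for i in range(1, n + 1)]))
    for n in (2, 3, 4)
]
```

For two packages this gives the three cases from the text: X1 alone, X2 alone, and X1 with X2.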
If this were enough, you could now rest ... until the first time some random package decides it'd like to use python3 instead of python, and you get a huge number of complaints that users now have two versions of python wasting disk/bandwidth/etc.
However, core python is very rarely enough. For any GUI application you'll need python3 bindings to gtk, and for anything a bit more complex (like yum) you'll need bindings to lots of C APIs (rpm, xml, gpg and curl, to name a few of yum's requirements). This means you've either gone to a lot of work for something that isn't very useful, or you have to create versioned packages for a significant part of the packages that touch python ... and then you have to test that (which, remember, is 3 times the work).
And you have to maintain that, testing updates and doing security errata (3x the work), and deal with bugs/complaints from people saying that package FOO is only available in version A and they need it in the other version ... or logging bugs against python-blah instead of python-blah3. Of course humans are lazy, so to get around doing some of this work someone will decide it's a good idea to have a single python-blah package that works with either/both versions of python ... and then you'll have more problems.
And then at the end you still have the usability problems, where the user installs "python-blah" but really wanted to install "python3-blah", and doesn't understand why things don't work.