
James Antill - YUM resource usage, an accurate assessment

May. 20th, 2008

10:00 pm - YUM resource usage, an accurate assessment


As a YUM developer it's hard not to notice the number of complaints about, and attacks on, YUM for being slow and/or using too much memory. The most common thing about almost all of these complaints, though, is how inaccurate they are. To be clear, YUM used to take a lot more time than was required to do simple operations, and there are very likely some more improvements that can be made for certain situations. BUT yum is currently "not slow" at most normal operations for Fedora users, and RHEL/CentOS users should be getting a much newer/faster version than they currently have in the very near future.

The most common misleading piece of data is a direct comparison of YUM to APT, smart or Zypper. The most obvious problem is that the commands/interfaces do not map directly between the applications, and so are hard to compare. For instance, at the simplest level, "yum install" accepts package names, "provides" or filenames contained within a package, all of which can have wildcards and/or some of epoch/version/release/arch. But the problem goes even deeper than that, as the core assumptions each application makes are very different. For instance, yum will automatically update its metadata for the configured repositories, but smart and apt both require the user to do this manually. Another example is that yum/rpm is assumed to be used in an environment whose dependencies use a combination of package names, versioned PRCO (Provides, Requires, Conflicts and Obsoletes) and explicit file dependencies.
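To make the "install accepts more than package names" point concrete, here is a minimal sketch in Python. It is not YUM's actual code: the Package class, the match_install_arg function and the sample data are all hypothetical, and it ignores epoch/version/release/arch entirely. It only shows that a single argument may have to be checked against names, provides and file lists, with wildcards, before anything else can happen.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase


@dataclass
class Package:
    # Hypothetical, simplified stand-in for real package metadata.
    name: str
    provides: list = field(default_factory=list)
    files: list = field(default_factory=list)


def match_install_arg(arg, packages):
    """Return names of packages whose name, provides or files match arg.

    arg may contain shell-style wildcards (*, ?, [...]).
    """
    matches = []
    for pkg in packages:
        candidates = [pkg.name] + pkg.provides + pkg.files
        if any(fnmatchcase(candidate, arg) for candidate in candidates):
            matches.append(pkg.name)
    return matches


# Made-up sample data, for illustration only.
PACKAGES = [
    Package("httpd", ["webserver"], ["/usr/sbin/httpd"]),
    Package("httpd-devel", [], ["/usr/bin/apxs"]),
    Package("bash", ["/bin/sh"], ["/bin/bash"]),
]

print(match_install_arg("httpd*", PACKAGES))
print(match_install_arg("/usr/sbin/httpd", PACKAGES))
print(match_install_arg("webserver", PACKAGES))
```

A tool that only ever accepts exact package names can skip the provides and file lists completely, which is one reason timing "install foo" across different tools does not compare like with like.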

In short, it's fair to say that each package management tool has a close relationship with the main distribution(s) it is developed for, if for no other reason than that's where the developers tend to come from. But there has also been more than one instance in the last 6 months where the specifics of how a new YUM feature would work were defined by Fedora, and not just by the YUM developers.

Which brings us to YUM resource usage within Fedora/etc. which, as I said initially, has been a real and valid complaint against older versions of YUM. However, the situation has improved a lot over the last year. The set of use cases for which current YUM can still be "significantly" slower than we'd like is vanishingly small. Obviously Fedora 7 and 8, as well as RHEL/CentOS 5, will be slower to pick up the speed enhancements that have been made since 3.2.8; however, this is not a failing of YUM, more a consequence of the longer release periods associated with those distributions.

As a final note, I'd say that the biggest reason a lot of these recent complaints annoy me is not just that they are inaccurate, but that they tend to grossly mislead the reader into thinking that "package management" is a solved problem and that the biggest obstacle left is whether "install foo" takes 3 seconds or 6 seconds. In my opinion this could not be further from the truth: managing 2-6 machines is an ugly problem in all the current solutions, and managing 100-10,000 is basically not done. Then there are things like the 10x10 problem, where we are only starting to see ideas like KOPERS which might help solve the problem (but will require changes in the way package management is assumed to work). And that's completely ignoring the problems that have had attempted solutions (that failed) multiple times, like "rollback" support.

Comments:

From:timlau.myopenid.com
Date:May 21st, 2008 07:41 am (UTC)
Very well written James, just describes how i feel too. :)
From:(Anonymous)
Date:May 21st, 2008 02:57 pm (UTC)

I love YUM but .....

I've been a long time user of Redhat and RPM in general and I appreciate the work the package management system does for me - if I can avoid doing tar-balls I do.

However, I think you might be missing the point a bit. Criticism is always hard - in particular if the word is "not good enough" after you've put in a great effort to make things better. Of course things can always be improved, but I don't think that's the point.

My observation - as the total outsider to this - is that if the product takes time because of its complexity, it may be time to reduce that complexity. An example could be: how often is the wild-card functionality used, versus a simple "give me this package" install in yum? If we're talking 90-10, why not simply have two versions, a simple and an advanced? The simple one would be for most people, while the rare system admin would use the advanced (slower but more complete) version.

In regards to your last comment, that is definitely an area that I'm faced with and would like to see solved. I think the approach would be to do image handling instead of package handling. Set up a primary image and use the package manager there, but then simply move the changed files over with "patch" type functionality. If I have 100 CentOS 5 boxes, I need /usr and most other areas the RPMs will touch to be in sync. Of course there are exceptions, and those would be handled by the package management. It's a total departure from basically having the same logic being executed for each host, but in large environments I think that's our only way out?
From:illiterat
Date:May 21st, 2008 03:37 pm (UTC)

Re: I love YUM but .....

However, I think you might be missing the point a bit. Criticism is always hard - in particular if the word is "not good enough" after you've put in a great effort to make things better. Of course things can always be improved, but I don't think that's the point.

Well I tried to make sure I mentioned it in the article, and I did link to the performance tests on different yum versions, but yes, we know that older versions did cross the threshold from "fast enough" to "not fast enough" for various reasons. However, current YUM code (I'm using 3.2.16 in Fedora 9) does "simple queries" in less than 2 seconds and "simple installs" in about 6 seconds (but note that you'll often need packages to be downloaded, and even if not, RPM will still need to install the packages, which will add significantly to this). At the other end, a full Fedora 8 to Fedora 9 update takes less than a minute within YUM (40ish seconds, IIRC).

I appreciate that CentOS 5 is still on 3.0.x, so anyone using that is going to have a radically different experience. However I assume the 3.2.8++ based version will be available "soon", given that it was released by Red Hat today. And I'd heard they'd been thinking of putting a 3.2.16 version into centosplus.

So my point was that, while those numbers could get smaller (and for all I know apt/zypper/etc. could be better in all cases) it is not the most beneficial goal to have the YUM part of install go from 6 seconds to 2 seconds.