Programming with Yum in 5 minutes, or so - James Antill — LiveJournal

Aug. 29th, 2008

08:49 pm - Programming with Yum in 5 minutes, or so

There are a lot of lines of code in yum, and it can be somewhat intimidating at first glance. However, a significant amount of effort has been made to make simple things easy, and the hard things not so hard. Any code using yum will almost certainly start with these four lines (and always the first and last one :).

    #! /usr/bin/python -tt

    import os
    import sys
    import yum

Those lines just tell python that you'd like to be able to use the yum code, and some stuff for the OS. Next comes the first bit of real code, which is also in almost every piece of code using yum:

    yb = yum.YumBase()

This creates a yum instance that you can work with. Then one more piece that is very useful:

    yb.conf.cache = os.geteuid() != 0

This just tells the yum instance not to try to update any of its data when the script isn't run as root, as the caller of the script then probably hasn't got the permissions to do so.

Now we have a real yum object that we can do things with. The three most useful parts to access are pkgSack, rpmdb and repos. The first two basically act the same, but rpmdb performs queries against the packages installed on the local machine, while pkgSack performs them against all the enabled (normally remote) repositories. The repos attribute is almost always used for one of three things: calling repos.enableRepo(), repos.disableRepo() and, less often, repos.listEnabled(). The last is for when you need to set/override some specific configuration for the repos.
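As a rough sketch of the repos calls just described, here is one way to narrow a run down to a chosen set of repositories. The helper name enable_only and the repo ids in the usage comment are made up for illustration; the function takes the yb object as an argument rather than creating it.

```python
def enable_only(yb, wanted):
    """Disable every configured repo, then enable just the ids in `wanted`.

    `yb` is expected to be a yum.YumBase() instance. This has to happen
    before the repositories are first used (e.g. before touching yb.pkgSack).
    """
    yb.repos.disableRepo('*')          # glob pattern: matches every repo id
    for repo_id in wanted:
        yb.repos.enableRepo(repo_id)   # a plain id that isn't configured is an error
    return sorted(repo.id for repo in yb.repos.listEnabled())

# Typical use (needs yum installed; 'fedora' and 'updates' are example ids):
#   import yum
#   yb = yum.YumBase()
#   print(enable_only(yb, ['fedora', 'updates']))
```

Returning the ids from listEnabled() is just a convenient way to check that the selection ended up as intended.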

The pkgSack and rpmdb attributes have a fairly large number of functions you can call, most of which return "package objects"; these are the things you work with most in yum code. Probably the most useful functions to get those package objects are: searchNevra(), returnPackages() and searchPrimaryFields(). There are also some optimized variants, like searchNames() and returnNewestByNameArch(). Some examples would be:

Simple version of "yum list" command

    # Get the repository package objects matching the passed arguments
    pkgs = yb.pkgSack.returnNewestByNameArch(patterns=sys.argv[1:])

    for pkg in pkgs:
        print "%s: %s" % (pkg, pkg.summary)
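searchNevra(), mentioned above, is the one to reach for when you already know the exact name. A minimal sketch, assuming it accepts name and arch keyword arguments on both pkgSack and rpmdb; find_by_name is a hypothetical helper, not part of yum:

```python
def find_by_name(sack, name, arch=None):
    """Return the package objects in `sack` with exactly this name.

    `sack` can be either yb.pkgSack (repositories) or yb.rpmdb (installed).
    """
    pkgs = sack.searchNevra(name=name, arch=arch)
    # Sort on the printable form so the output order is stable
    return sorted(pkgs, key=str)

# Typical use (needs yum installed):
#   for pkg in find_by_name(yb.rpmdb, 'yum'):
#       print("%s: %s" % (pkg, pkg.summary))
```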

Simple stats. gathering from installed packages

    # Find the ten biggest installed packages
    pkgs = yb.rpmdb.returnPackages()
    pkgs.sort(key=lambda x: x.size, reverse=True)
    print "Top ten installed packages:"
    done = set()
    for pkg in pkgs:
        if pkg.name in done:
            continue
        done.add(pkg.name)
        print "%s: %sMB" % (pkg, pkg.size / (1024 * 1024))
        if len(done) >= 10:
            break
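One thing to watch in the stats example: the integer division means anything under a megabyte prints as 0MB. If you run it with a different sort order, a small formatting helper might read better. This is only a sketch, and format_size is a made-up name:

```python
def format_size(num_bytes):
    """Format a byte count using binary units (1KB = 1024 bytes)."""
    size = float(num_bytes)
    for unit in ('B', 'KB', 'MB', 'GB'):
        if size < 1024.0:
            return "%.1f%s" % (size, unit)
        size /= 1024.0
    # Anything that survived four divisions is at least a terabyte
    return "%.1f%s" % (size, 'TB')

# In the loop above this would replace the division:
#   print("%s: %s" % (pkg, format_size(pkg.size)))
```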

Slightly more advanced topics

After playing with yum code for a little bit, you'll probably encounter a function which you might think would return a "package object" but doesn't, returning a tuple of data instead. The two common tuples within yum are the "package tuple" and the "dependency tuple", which are:

    # Package tuple:
    (pkg.name, pkg.arch, pkg.epoch, pkg.version, pkg.release)
    # Dependency (or the Provides/Requires/Conflicts/Obsoletes (PRCO)) tuple:
    (pkg.name, 'EQ', (pkg.epoch, pkg.version, pkg.release))
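To make the two layouts concrete, here is a small standalone sketch that turns a package tuple into an exact-version dependency tuple. PkgTup and exact_dep are hypothetical names, and the sample values are invented:

```python
from collections import namedtuple

# Same field order as the package tuple shown above
PkgTup = namedtuple('PkgTup', 'name arch epoch version release')

def exact_dep(pkgtup):
    """Build an 'EQ' dependency tuple from a package tuple."""
    name, arch, epoch, version, release = pkgtup
    # The arch is dropped: a dependency tuple only carries name, flag and EVR
    return (name, 'EQ', (epoch, version, release))

# Made-up sample values, purely for illustration
sample = PkgTup('yum', 'noarch', '0', '3.2.19', '3.fc9')
print(exact_dep(sample))
```

The namedtuple is only there for readability; a plain five-item tuple unpacks the same way.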

After that you'll probably start playing with the "transaction info" (yb.tsInfo), and the "install", "update" and "remove" functions of YumBase, so you can change the system as well as query it. Then you'll want to present your information in a way that looks as good as a normal yum command (using the "internal" output module), although that is less supported. A somewhat useful example might be:

"Clever" way to manually update almost all the metadata for any usable repos. installed

    #! /usr/bin/python -tt

    import os
    import sys
    import yum

    yb = yum.YumBase()

    yb.conf.cache = os.geteuid() != 0

    from urlgrabber.progress import TextMeter

    # Use the "internal" output mode of yum's cli
    sys.path.insert(0, '/usr/share/yum-cli')
    import output

    # Try not to be annoying if run from cron etc.
    if sys.stdout.isatty():
        yb.repos.setProgressBar(TextMeter(fo=sys.stdout))
        yb.repos.callback = output.CacheProgressCallback()
        yumout = output.YumOutput()
        freport = ( yumout.failureReport, (), {} )
        yb.repos.setFailureCallback( freport )

    # Enable all the repos. a user might want to use and sync. the metadata.
    # Note this needs to be done before the repositories are used.
    # The trailing ',' turns each id into a list of ids, which (as far as I
    # can tell) is what lets repos. that aren't configured be skipped quietly.
    for name in ('updates-testing', 'rawhide', 'livna', 'adobe-linux-i386',
                 'brew', 'rhts', 'koji-static'):
        yb.repos.enableRepo(name + ',')
    for repo in yb.repos.listEnabled():
        yb.repos.enableRepo(repo.id + '-source'    + ',')
        yb.repos.enableRepo(repo.id + '-debuginfo' + ',')
    yb.repos.doSetup()
    for repo in yb.repos.listEnabled():
        repo.mdpolicy        = 'group:main'
        repo.metadata_expire = 0
        # Just accessing repoXML makes yum fetch the repo's metadata index
        repo.repoXML

    # This is somewhat "magic", it unpacks the metadata making it usable.
    yb.repos.populateSack(mdtype='metadata', cacheonly=1)
    yb.repos.populateSack(mdtype='filelists', cacheonly=1)
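The "install", "update" and "remove" functions mentioned earlier work on the same yb object: they only queue work in yb.tsInfo, and nothing changes on disk until the transaction is built and run. A minimal sketch of the queueing step, assuming install() accepts a name keyword and returns the transaction members it added; queue_installs is a made-up helper, and depending on the yum version install() may raise instead of returning an empty list when nothing matches:

```python
def queue_installs(yb, names):
    """Add each named package to the pending transaction.

    Returns the names that produced at least one transaction member.
    """
    queued = []
    for name in names:
        # install() hands back the transaction members it added
        if yb.install(name=name):
            queued.append(name)
    return queued

# Typical use (needs yum installed and root; 'zsh' is just an example):
#   queue_installs(yb, ['zsh'])
#   yb.buildTransaction()      # resolve dependencies
#   yb.processTransaction()    # run it, on yum versions that provide this call
```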

Hopefully that will go some way to giving you an overview of how you can use the yum API to perform queries or tasks that would otherwise be very difficult. For further information you can use the help feature of ipython to look at docstrings for the various components, and even use the TAB complete feature of ipython to see all the available attributes of packages, repos, pkgSack or the YumBase itself.
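If ipython isn't to hand, roughly the same exploration can be done from plain python with dir() and the inspect module. A small sketch, where public_api is a hypothetical helper:

```python
import inspect

def public_api(obj):
    """Map each public attribute name of `obj` to its first docstring line."""
    summary = {}
    for name in dir(obj):
        if name.startswith('_'):
            continue
        try:
            value = getattr(obj, name)
        except Exception:
            # Some attributes are computed on access and can fail
            continue
        doc = inspect.getdoc(value) or ''
        summary[name] = doc.split('\n')[0]
    return summary

# Typical use (needs yum installed):
#   import yum
#   for name, doc in sorted(public_api(yum.YumBase).items()):
#       print("%-30s %s" % (name, doc))
```

Passing the class rather than a live instance is deliberate here: reading attributes such as pkgSack on an instance can trigger yum's setup work, which is probably not what you want while just browsing.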


Date:August 30th, 2008 04:04 pm (UTC)


Very nice introduction.