James Antill @ 2008-08-29 20:49:00
Entry tags: example, python, yum
Programming with Yum in 5 minutes, or so
There are a lot of lines of code in yum, and it can be somewhat intimidating at first glance. However, a significant amount of effort has been made to make the simple things easy, and the hard things not so hard. Any code using yum will almost certainly start with these four lines (and always the first and last one :).
    #! /usr/bin/python -tt

    import os
    import sys
    import yum
Those lines just tell python that you'd like to be able to use the yum code, and some stuff for the OS. Next comes the first bit of real code, which is also in almost every piece of code using yum:
    yb = yum.YumBase()
This creates a yum instance that you can work with. Then one more piece that is very useful:
    yb.conf.cache = os.geteuid() != 0
This just tells the yum instance not to try and update any of its data when the script isn't run as root, as the caller probably hasn't got the permissions to do so.
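As a rough sketch, the preamble so far can be wrapped up in a function. The names wants_cache_only() and make_yumbase() are made up for this example, and the import is guarded only so the file still loads on a machine without yum:

```python
import os

try:
    import yum  # ships with yum itself, so it is missing on most non-RPM systems
except ImportError:
    yum = None


def wants_cache_only(euid):
    """Hypothetical helper: anyone but root should stick to the cached data."""
    return euid != 0


def make_yumbase():
    """Create a YumBase set up the same way as the lines above."""
    yb = yum.YumBase()
    yb.conf.cache = wants_cache_only(os.geteuid())
    return yb


# Only talk to yum when it is actually importable.
if yum is not None and __name__ == "__main__":
    yb = make_yumbase()
    print("cache only: %s" % yb.conf.cache)
```

Keeping the root check in its own tiny function is just a convenience, it lets you try the logic without needing yum or root.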
Now we have a real yum object that we can do things with. The three most useful parts to access are pkgSack, rpmdb and repos. The first two basically act the same, but rpmdb performs queries against the packages installed on the local machine, while pkgSack performs them against all the enabled (normally remote) repositories. The repos attribute is almost always used for one of three things: calling repos.enableRepo(), repos.disableRepo() and, less often, repos.listEnabled(). The last is for when you need to set or override some specific configuration for the repos.
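A minimal sketch of the repos calls named above, assuming yb is the YumBase instance created earlier. The helper names enabled_repo_ids() and switch_repos() are hypothetical, only enableRepo(), disableRepo(), listEnabled() and the repo's id attribute come from yum:

```python
def enabled_repo_ids(yb):
    """Return the ids of the currently enabled repos, as a sorted list."""
    return sorted(repo.id for repo in yb.repos.listEnabled())


def switch_repos(yb, enable=(), disable=()):
    """Disable, then enable, the given repo ids and report what is left on.

    Hypothetical helper: do this before the repositories are first used.
    """
    for repoid in disable:
        yb.repos.disableRepo(repoid)
    for repoid in enable:
        yb.repos.enableRepo(repoid)
    return enabled_repo_ids(yb)
```

With a real instance you might call something like switch_repos(yb, enable=['updates-testing']), where the repo id is whatever your own configuration defines.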
The pkgSack and rpmdb attributes have a fairly large number of functions you can call, most of which return "package objects"; these are the main things you work with in yum code. Probably the most useful functions to get those package objects are searchNevra(), returnPackages() and searchPrimaryFields(). There are also some optimized variants, like searchNames() and returnNewestByNameArch(). Some examples would be:
Simple version of "yum list" command
    # Get the repository package objects matching the passed arguments
    pkgs = yb.pkgSack.returnNewestByNameArch(patterns=sys.argv[1:])

    for pkg in pkgs:
        print "%s: %s" % (pkg, pkg.summary)
Simple stats. gathering from installed packages
    # Find the ten biggest installed packages
    pkgs = yb.rpmdb.returnPackages()
    pkgs.sort(key=lambda x: x.size, reverse=True)
    print "Top ten installed packages:"
    done = set()
    for pkg in pkgs:
        if pkg.name in done:
            continue
        done.add(pkg.name)
        print "%s: %sMB" % (pkg, pkg.size / (1024 * 1024))
        if len(done) >= 10:
            break
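searchNevra() is mentioned above but not shown, so here is a small sketch of it, again assuming yb is the YumBase instance from the start. The helper name installed_versions() is made up, and the result is sorted as plain strings, not by RPM version rules:

```python
def installed_versions(yb, name):
    """List 'epoch:version-release.arch' for every installed package called name.

    Hypothetical helper around rpmdb.searchNevra(); swap yb.rpmdb for
    yb.pkgSack to ask the enabled repositories instead.
    """
    pkgs = yb.rpmdb.searchNevra(name=name)
    return sorted("%s:%s-%s.%s" % (pkg.epoch, pkg.version, pkg.release, pkg.arch)
                  for pkg in pkgs)
```

An empty list just means nothing by that name is installed, which makes it a cheap "is this here?" check.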
Slightly more advanced topics
After playing with yum code for a little bit, you'll probably run into a function which you might think would return a "package object" but doesn't, returning a tuple of data instead. The two common tuples within yum are the "package tuple" and the "dependency tuple", which are:
    # Package tuple:
    (pkg.name, pkg.arch, pkg.epoch, pkg.version, pkg.release)
    # Dependency (or the Provides/Requires/Conflicts/Obsoletes (PRCO)) tuple:
    (pkg.name, 'EQ', (pkg.epoch, pkg.version, pkg.release))
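To make the two shapes a bit more concrete, here is a sketch that builds both tuples from a stand-in package and prints the dependency one in a readable form. FakePkg, the helper names and the version numbers are all invented for illustration, and the set of flags other than 'EQ' is my assumption about what yum uses:

```python
from collections import namedtuple

# Stand-in for a yum package object, with only the attributes used below.
FakePkg = namedtuple("FakePkg", "name arch epoch version release")

# Assumed flag spellings; only 'EQ' appears in the tuples shown above.
_FLAG_SYMBOLS = {'EQ': '=', 'LT': '<', 'LE': '<=', 'GT': '>', 'GE': '>='}


def package_tuple(pkg):
    """Build the (name, arch, epoch, version, release) package tuple."""
    return (pkg.name, pkg.arch, pkg.epoch, pkg.version, pkg.release)


def exact_dep_tuple(pkg):
    """Build a dependency tuple matching exactly this package's version."""
    return (pkg.name, 'EQ', (pkg.epoch, pkg.version, pkg.release))


def dep_tuple_to_string(dep):
    """Hypothetical formatter: turn a dependency tuple into readable text."""
    name, flag, evr = dep
    if flag is None:
        # No flag is treated here as an unversioned dependency.
        return name
    epoch, version, release = evr
    return "%s %s %s:%s-%s" % (name, _FLAG_SYMBOLS[flag], epoch, version, release)


pkg = FakePkg("yum", "noarch", "0", "3.2.19", "3.fc9")  # made-up values
print(package_tuple(pkg))
print(dep_tuple_to_string(exact_dep_tuple(pkg)))  # yum = 0:3.2.19-3.fc9
```

Note the order in the package tuple: arch comes second, which is easy to get wrong if you are used to reading name-version-release.arch.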
After that you'll probably start playing with the "transaction info" (yb.tsInfo), and the "install", "update" and "remove" functions of YumBase, so you can change the system as well as query it. Then you'll want to present your information in a way that looks as good as a normal yum command (using the "internal" output module), although that is less supported. A somewhat useful example might be:
"Clever" way to manually update almost all the metadata for any usable repos. installed
    #! /usr/bin/python -tt

    import os
    import sys
    import yum

    yb = yum.YumBase()

    yb.conf.cache = os.geteuid() != 0

    from urlgrabber.progress import TextMeter

    # Use the "internal" output mode of yum's cli
    sys.path.insert(0, '/usr/share/yum-cli')
    import output

    # Try not to be annoying if run from cron etc.
    if sys.stdout.isatty():
        yb.repos.setProgressBar(TextMeter(fo=sys.stdout))
        yb.repos.callback = output.CacheProgressCallback()
        yumout = output.YumOutput()
        freport = ( yumout.failureReport, (), {} )
        yb.repos.setFailureCallback( freport )

    # Enable all the repos. a user might want to use and sync. the metadata.
    # Note this needs to be done before the repositories are used.
    for name in ('updates-testing', 'rawhide', 'livna', 'adobe-linux-i386',
                 'brew', 'rhts', 'koji-static'):
        yb.repos.enableRepo(name + ',')
    for repo in yb.repos.listEnabled():
        yb.repos.enableRepo(repo.id + '-source' + ',')
        yb.repos.enableRepo(repo.id + '-debuginfo' + ',')
    yb.repos.doSetup()
    for repo in yb.repos.listEnabled():
        repo.mdpolicy = 'group:main'
        repo.metadata_expire = 0
        # Just accessing repoXML is enough to make yum load the repo's repomd.xml.
        repo.repoXML
    # This is somewhat "magic", it unpacks the metadata making it usable.
    yb.repos.populateSack(mdtype='metadata', cacheonly=1)
    yb.repos.populateSack(mdtype='filelists', cacheonly=1)
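The transaction side mentioned earlier (yb.tsInfo and install()) could look roughly like the sketch below. It only plans the change and never runs it. plan_install() is a hypothetical helper, yb is assumed to be a YumBase instance, and I'm assuming buildTransaction() hands back a result code plus a list of messages, so check both against your yum version:

```python
def plan_install(yb, names):
    """Queue installs for the given package names and resolve dependencies.

    Hypothetical helper: returns the result code and messages from
    buildTransaction(), plus the packages now sitting in yb.tsInfo.
    Nothing is installed; actually running the transaction is a separate step.
    """
    for name in names:
        yb.install(name=name)
    rescode, msgs = yb.buildTransaction()
    members = sorted(str(txmbr.po) for txmbr in yb.tsInfo.getMembers())
    return rescode, msgs, members
```

Looking at the members list before going any further is a cheap way to see which dependencies got pulled in alongside what you asked for.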
Hopefully that will go some way to giving you an overview of how you can use the yum API to perform queries or tasks that would otherwise be very difficult. For further information you can use the help feature of ipython to look at docstrings for the various components, and even use the TAB complete feature of ipython to see all the available attributes of packages, repos, pkgSack or the YumBase itself.
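If ipython isn't to hand, plain python can do a rougher version of the same exploring. This is only a sketch using the standard dir() and inspect.getdoc(), and both helper names are made up:

```python
import inspect


def public_attrs(obj):
    """Return the names on obj that don't start with an underscore."""
    return sorted(name for name in dir(obj) if not name.startswith('_'))


def first_doc_line(obj, attr):
    """Return the first docstring line of obj.attr, or '' if it has none."""
    doc = inspect.getdoc(getattr(obj, attr))
    return doc.splitlines()[0] if doc else ''
```

It is probably safer to pass a class (for example yum.YumBase) than a live instance to first_doc_line(), since looking up an attribute on an instance can trigger real work, as the repo.repoXML line in the script above shows.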