Perl’s *DBM_File packages: what the heck are you people thinking?

Among many other things, my day-job involves working on a web-based system, written in Perl, for administrating our MasterKey metasearching system.  We call the admin system MKAdmin.  It was developed on Debian GNU/Linux, since that’s our usual production platform, but I also work on it from my Mac laptop.

Excuse me if I rant for a moment.

MKAdmin, until recently, used the NDBM_File module to implement a simple persistent session store — just an on-disk hashtable that maps cookie values to session structures.  NDBM_File is part of the core Perl distribution.

But when one of our partners wanted to run it on Fedora Linux, it became apparent that in their infinite wisdom the Fedora packagers have removed the NDBM_File module from their Perl package, so MKAdmin wouldn’t run. Understand: it isn’t that they didn’t provide an RPM package for the NDBM_File module, it’s that they removed that module from the Perl package, even though it’s part of the Perl core distribution.

So I changed MKAdmin to use the functionally equivalent GDBM_File module, which is also part of the core Perl distribution but hasn’t been sabotaged by the Fedora people.  And so all was well.

Except that now all the MKAdmin applications I have running on my Mac laptop no longer work.

It turns out that in their infinite wisdom the MacPorts packagers have removed the GDBM_File module from their Perl package, so MKAdmin won’t run. Understand: it isn’t that they didn’t provide a port package for the GDBM_File module, it’s that they removed that module from the Perl package, even though it’s part of the Perl core distribution.

Holy crap, people! Can you all please just stop pooping all over the Perl distribution? Would you please refrain from doing additional work in order to remove functionality? Would you please just LEAVE WELL ENOUGH ALONE?!

So now I am going to modify MKAdmin again, so that instead it uses the (also functionally equivalent) SDBM_File module — which, I’ve checked, does exist on Debian, Red Hat and MacPorts.
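
A minimal sketch of what such a session store might look like with SDBM_File. This is not MKAdmin’s actual code: the cookie value, the session fields and the temporary path are all made up for illustration, and Storable is my assumption for turning a session structure into the flat string that a DBM file can hold.

```perl
use strict;
use warnings;
use Fcntl;                        # O_RDWR, O_CREAT
use SDBM_File;
use Storable qw(freeze thaw);
use File::Temp qw(tempdir);

# Hypothetical location; a real application would use a fixed path.
my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/sessions";       # SDBM stores its data alongside this base name

tie my %sessions, 'SDBM_File', $path, O_RDWR | O_CREAT, 0666
    or die "Cannot open session store $path: $!";

# DBM files hold flat strings, so the session structure is serialized first.
my $cookie = 'abc123';            # hypothetical cookie value
$sessions{$cookie} = freeze({ user => 'mike', logged_in => 1 });

# Reading it back: look up the cookie, then deserialize.
my $session = thaw($sessions{$cookie});
print "user=$session->{user}\n";

untie %sessions;
```

One caveat worth checking before relying on this: the SDBM_File documentation says that a key plus its value may not exceed 1008 bytes, so large session structures would not fit.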

But, really, folks.  It’s so avoidable.

Just stop it.

17 responses to “Perl’s *DBM_File packages: what the heck are you people thinking?”

  1. I guess you can install in Fedora the offending package using CPAN.

    Maybe you could try AnyDBM_File too.

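    use warnings;
    use strict;

    use Fcntl;          # O_RDWR and O_CREAT
    use AnyDBM_File;

    my %hash;

    # Most *DBM_File back-ends require explicit open flags and a file mode.
    tie %hash, 'AnyDBM_File', 'test.db', O_RDWR | O_CREAT, 0666
        or die "Cannot tie test.db: $!";

    if ($hash{'stored'})
    {
        print "Look out! $hash{'stored'}\n";
    }
    else
    {
        $hash{'stored'} = 'Store this, no matter what *DBM_File is used';
    }

    untie %hash;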
     

    In my case (Fedora 12), file says “test.db: Berkeley DB (Hash, version 9, native byte-order)”.

  2. You can’t (easily) install either GDBM or NDBM from CPAN, precisely because they are both part of the core Perl distribution. They have no distributions of their own. No doubt it would be possible to download the whole Perl distribution and extract the relevant bits, but I can easily imagine that getting messy very quickly, and I don’t want to go there.

    … but I do wonder whether AnyDBM_File might have been a better solution.

    Well, the next time I have to move MKAdmin to a different persistent session-store back-end, I’ll try that. *sigh*

  3. The point of AnyDBM_File is it’ll try to use any *DBM_File in the system.

    If there’s one installed, you’re done, as long as you’re using an interface that is common to all *DBM_File packages.
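
    A small sketch of how that choice can be steered and inspected. The preference order below is only an assumption for illustration, and the file name is made up. The AnyDBM_File documentation describes setting @AnyDBM_File::ISA in a BEGIN block before loading the module.

    ```perl
    use strict;
    use warnings;
    use Fcntl;                      # O_RDWR, O_CREAT
    use File::Temp qw(tempdir);

    # Illustrative preference order. GDBM_File is left out here because it
    # takes its own GDBM_* open flags rather than the Fcntl ones used below.
    BEGIN { @AnyDBM_File::ISA = qw(DB_File NDBM_File SDBM_File) }
    use AnyDBM_File;

    # After loading, @ISA holds only the back-end that was actually found.
    my $backend = $AnyDBM_File::ISA[0];
    print "Using $backend\n";

    my $dir = tempdir(CLEANUP => 1);
    tie my %hash, 'AnyDBM_File', "$dir/test", O_RDWR | O_CREAT, 0666
        or die "Cannot tie: $!";
    $hash{stored} = 'kept';
    my $value = $hash{stored};
    print "$value\n";
    untie %hash;
    ```

    Which back-end gets printed depends entirely on what the local Perl package ships, which is rather the point of the original post.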

  4. I think that the problem you’ve had is part of a larger one within free software. Many distributions make changes from the upstream projects, unnecessarily.

    It /shouldn’t matter/ in most cases what distribution you use, because the same programs are being installed on all of them. I have often searched the Web to see whether anyone has had the same problem with a particular program, and landed on forums specific to one distribution (e.g. Ubuntu) where people suggest solutions (installing distribution-specific packages) that are irrelevant to anyone using a different distribution. In these cases there is always a distribution-nonspecific solution, but because the world of free software is split up like this, it can be hard to find anyone discussing it.

    One reason I have installed Slackware is that it tends to provide unmodified upstream packages, and is unobtrusive as a package management system.

  5. Not being a perl guy, I have to ask: why are there three functionally equivalent modules?

  6. Aric, you can probably guess that the answer is: Historical Reasons.

  7. Ah, yes, the perennial answer to “why is this the way it is?”

    I’ve used that one many times. Especially when I don’t want to have to explain the actual reasoning behind how it ended up that way. Especially when the new manager isn’t going to understand that reasoning, and is going to want to go back and change it all to be “proper” even though that’s going to break existing stuff built on top of it and will cause the project to run overtime with no practical benefit to anyone.

    But I digress…

  8. So why on earth are there three core packages that do exactly the same thing?

  9. Omer Katz asked:

    So why on earth are there three core packages that do exactly the same thing?

    Since Aric Caley asked the same thing, I guess an actual answer might be in order. First there was DBM, the original hash-table-on-disk library provided by Unix. That was superseded by NDBM (New DBM), which was better in various ways, including a different on-disk representation (and IIRC a wider and less modal API, introducing for the first time the concept of a descriptor for an open NDBM file, so that you could have two or more of them open at once). Then came GDBM (GNU DBM) and SDBM (no idea), which also offered additional functionality and/or performance and had different on-disk formats.

    If you are writing a Perl program that needs access to a *DBM file generated by another package, then of course you need to use the correct *DBM_File module (I can’t read your NDBM files with my GDBM-based Perl program), which is why Perl comes with modules for all the different *DBM packages. Except of course when it’s sabotaged by a packager.
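
    Because the on-disk formats differ, moving data from one to another means opening each file with its own module and copying the entries across. A rough sketch, assuming both modules are installed: copy_dbm is a hypothetical helper name, the paths are temporary, and the target falls back to SDBM_File if this Perl has no DB_File.

    ```perl
    use strict;
    use warnings;
    use Fcntl;                      # O_RDONLY, O_RDWR, O_CREAT
    use File::Temp qw(tempdir);

    # Copy every key/value pair from one DBM file to another. The class names
    # are whatever *DBM_File modules the local Perl actually has installed.
    sub copy_dbm {
        my ($from_class, $from_path, $to_class, $to_path) = @_;
        for my $class ($from_class, $to_class) {
            eval "require $class; 1" or die "Cannot load $class: $@";
        }
        tie my %old, $from_class, $from_path, O_RDONLY, 0
            or die "Cannot open $from_path: $!";
        tie my %new, $to_class, $to_path, O_RDWR | O_CREAT, 0666
            or die "Cannot create $to_path: $!";
        my $count = 0;
        while (my ($key, $value) = each %old) {
            $new{$key} = $value;
            $count++;
        }
        untie %old;
        untie %new;
        return $count;
    }

    # Build a small SDBM file to act as the "old" store.
    my $dir = tempdir(CLEANUP => 1);
    require SDBM_File;
    tie my %seed, 'SDBM_File', "$dir/old", O_RDWR | O_CREAT, 0666
        or die "Cannot create seed file: $!";
    $seed{alpha} = 1;
    $seed{beta}  = 2;
    untie %seed;

    # Prefer DB_File as the target; fall back to SDBM_File if it is missing.
    my $target = eval { require DB_File; 1 } ? 'DB_File' : 'SDBM_File';
    my $copied = copy_dbm('SDBM_File', "$dir/old", $target, "$dir/new");
    print "Copied $copied entries into a $target file\n";
    ```

    GDBM_File would need its own GDBM_* open flags instead of the Fcntl ones, so this sketch does not cover it as written.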

  10. When I search around trying to find out why Fedora might have wanted to remove NDBM_File, I find references to BUG: 84671 “perl: should something obsolete perl-NDBM_File?”

    But no discussion of why, only of whether. I see one comment claiming that it was obsoleted in rhl 8, and then unobsoleted in rhl 9, but still nothing about why.

  11. Thanks for the pointer, Jeff. This led me to the email archived at http://www.redhat.com/archives/fedora-perl-devel-list/2005-May/msg00002.html which contains the least uninformative comment I’ve seen on the subject:

    It was removed because it really isn’t necessary; there is DB_File
    which does the same, and on Linux, NDBM is just GDBM anyway, which is
    part of perl. This reduced the confusion and magic of the NDBM module
    since ndbm varies from platform to platform, making portable files
    impossible.

    I can understand what the thinking was here; it was wrong, but it wasn’t incomprehensible. They just didn’t think through the consequences.

  12. Nathan Myers

    What it doesn’t explain is why Apple removed GDBM from their Perl. Of course we will never get an official answer to that, but we know the answer already: if you remove NDBM to drive toward maximizing portability, you remove GDBM to sabotage portability. It is clearly in Apple’s interest to make porting away from Apple harder. That it made porting to Apple harder in this case is something Apple is evidently willing to live with.

  13. This is extremely strange. Usually you just change the version number. Well, Perl says that there is more than one way to do it. There is a reason why I like Python so much :)

  14. Vanilla packages are ftw. (I use Arch, thanks for asking!)

  15. Benjamin Franz

    GDBM probably went away for the same reason I don’t use it anymore (and haven’t for several years). It’s buggy, undocumented, and unmaintained. The last release of the underlying library was eight years ago, for goodness’ sake.

    Really, if you are going to use a local key/value store just use BerkeleyDB.

    DBM/NDBM/GDBM are all really just bad ideas nowadays.

  16. I hate to reopen an old thread, but I stumbled across this in a search for the answer to a related question, and since it’s still out there, I just want to clarify:

    Perl does not provide three modules with equivalent functionality. Like the more modern DBD/DBI system for SQL access, the *DBM_File modules are all interfaces to APIs provided by external systems. NDBM, GDBM, BDB, etc., are all libraries or services commonly provided by systems Perl commonly runs on. But not all services are provided by every OS. Until the introduction of SDBM, Perl provided no native on-disk hash store of its own, since Perl’s built-in hash functions provided the services other languages relied on *DBM_File implementations for.

    The functions are part of the core distribution (and have been since the very beginning, before CPAN) to allow Perl to maintain files written by other processes. There are so many of them because back in the day, each mainframe OS provided its own DBM implementation but Perl was intended to be as portable as possible, so if you were moving to a new system, all you needed to do was update the “use” line (or use AnyDBM_File).

    The expectation that NDBM might run on some Linux, or GDBM might be installed on Solaris, etc., is relatively new, and although some OSes do supply multiple *DBM versions, others still only ship with whatever their traditional database is. The missing Perl module is normally simply a reflection of the missing underlying functionality. If Red Hat has removed the Perl NDBM_File module, it’s almost certainly because they’ve dropped NDBM itself from the system. Even if the Perl module were installed, it wouldn’t do anything.

    Likewise, Apple historically doesn’t ship any GNU software for licensing reasons, which means no GnuDB (GDBM) interface. GDBM_File is probably dropped for a similar reason. Even if they built their default Perl with GDBM_File, it would fail, because GDBM itself isn’t provided by Apple.

    As someone else suggested, the problem is that whoever wrote the original routine tied it to an OS-specific feature, and AnyDBM_File is your friend.

  17. Thanks for this historical background, Jay.
