Yet another idea for a project

The Library of Congress has dumped a lot (like a LOT) of data on the net, including their catalogue. I was wondering… Would it be possible to build a machine learning model that returns a subject based on the title? And maybe on other information as well?
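To make the idea a bit more concrete, here is a minimal sketch of what "title in, subject out" could look like. Everything in it is my own assumption: the handful of titles and subjects are made up for illustration (they are not real catalogue records), the function names are hypothetical, and a simple Naive Bayes word-count classifier is just one possible starting point, not necessarily the right model for real catalogue data.

```python
import math
import re
from collections import Counter, defaultdict

# Made-up (title, subject) pairs, for illustration only.
# A real experiment would use records from the actual catalogue.
TOY_RECORDS = [
    ("Introduction to organic chemistry", "Chemistry"),
    ("Principles of chemical reactions", "Chemistry"),
    ("Organic reactions and synthesis", "Chemistry"),
    ("A history of the Roman empire", "History"),
    ("The fall of the Roman republic", "History"),
    ("Medieval Europe: a short history", "History"),
]


def tokenize(title):
    """Lowercase a title and split it into alphabetic words."""
    return re.findall(r"[a-z]+", title.lower())


def train(records):
    """Count how often each word occurs in titles of each subject."""
    subject_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for title, subject in records:
        subject_counts[subject] += 1
        for word in tokenize(title):
            word_counts[subject][word] += 1
            vocabulary.add(word)
    return subject_counts, word_counts, vocabulary


def predict(model, title):
    """Return the subject with the highest Naive Bayes log-score."""
    subject_counts, word_counts, vocabulary = model
    total_titles = sum(subject_counts.values())
    best_subject, best_score = None, -math.inf
    for subject, count in subject_counts.items():
        # Prior: how common the subject is among the training titles.
        score = math.log(count / total_titles)
        # Add-one smoothing so unseen words do not zero out a subject.
        denominator = sum(word_counts[subject].values()) + len(vocabulary)
        for word in tokenize(title):
            if word in vocabulary:  # words never seen in training are skipped
                score += math.log((word_counts[subject][word] + 1) / denominator)
        if score > best_score:
            best_subject, best_score = subject, score
    return best_subject


model = train(TOY_RECORDS)
print(predict(model, "Organic chemistry for beginners"))  # Chemistry
print(predict(model, "Roman history"))                    # History
```

With six toy titles this proves nothing, of course. The interesting question is whether something like it holds up on the real catalogue, where there are far more subjects and a title alone may not say much.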

Something to look into during the long hours of summer, when the boss is on holiday, the patrons are away, and we have time to do interesting stuff? Not that we are not doing interesting stuff already, but stuff that is interesting in itself.