Among the many pies I have my thumbs in at the moment, I am particularly interested in using technology to bring greater transparency to government. One of the most prominent obstacles to government transparency might be surprising: while most people immediately think of deliberate secrecy as the pre-eminent threat to transparency, simple dysfunction plays at least as large a role in preventing public access to state records. Immense troves of data remain available solely on ink and paper. Information that has been computerized remains locked in private intranets. Even data that is online, organized and available remains in a format that prevents semantic contextualization, either because documents are stored as image files (TIFF) or because they sit in difficult-to-decipher compressed formats (PDF or XPS). And in the rare cases where government agencies have made information public, semantically decipherable and accessible over the internet, the problem remains of indexing that data using a common s