2011/03/11

How to automatically embed hyperlinks into citations and bibliographical entries of a downloaded PDF document! Part I

Nowadays, a researcher can amass a huge number of electronic papers and books in their specialized area of interest. Naturally, there are a lot of cross-references between these sources. The usual way of managing these files is to build a bibliographical database in one of the numerous dedicated programs, such as EndNote, Papers, JabRef and BibDesk. In this post I will try to present an exciting new way of "managing" your files which, in a way, makes managing them obsolete!

Specifically, I created a workflow for embedding DevonThink hyperlinks into the very citations and bibliography entries of a PDF file. As a result, when you want to read a cited paper or book, you do not need to go to your bibliography manager or, even worse, dig through the folder hierarchy of your hard drive to find the file. Instead, you can just click the citation, or the title or author in the bibliography entry, and the respective PDF file opens instantly! The magic of DevonThink hyperlinks in action!

The required tools/programs for this dream to come true are the following:
  1. Adobe Acrobat Pro for Windows
  2. AutoBookmark plug-in for Adobe Acrobat Pro
  3. A Windows virtual machine 
  4. DevonThink
  5. BibDesk
  6. TextMate
  7. Link dictionaries
  8. AppleScripts for automatically creating the link dictionaries from BibDesk

More details about the workflow

AutoBookmark searches page text for occurrences of specific words or phrases and generates link annotations according to the user specifications. A link dictionary is a collection of search terms (words, phrases, etc.) and corresponding link actions. When AutoBookmark finds a search term from a dictionary, it automatically creates a link annotation using the link action associated with that term. Each search term should be unique within a link dictionary (excerpt from AutoBookmark help file). So you need an appropriate link dictionary for AutoBookmark to create the hyperlinks to the sources automatically.
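To make the idea concrete, here is a minimal Python sketch of dictionary-driven linking. It is only a conceptual illustration, not AutoBookmark's implementation or its dictionary file format: the `link_dictionary` contents, the `find_links` helper and the placeholder link targets are all assumptions made for this example.

```python
import re

# Hypothetical link dictionary: each unique search term maps to one link action.
# The target strings are placeholders, not real DevonThink item links.
link_dictionary = {
    "Clark 2008": "link-to-clark-2008",
    "Clark (2008": "link-to-clark-2008",
}

def find_links(page_text, dictionary):
    """Return (start, end, term, target) for every occurrence of a search term."""
    hits = []
    for term, target in dictionary.items():
        # re.escape makes the term match literally, including the "(".
        for m in re.finditer(re.escape(term), page_text):
            hits.append((m.start(), m.end(), term, target))
    return sorted(hits)

page = "As argued by Clark (2008, p. 12) and again in Clark 2008."
for start, end, term, target in find_links(page, link_dictionary):
    print(start, end, term, target)
```

Each hit corresponds to one place where a link annotation would be drawn over the matched text.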

For our case, I consider it optimal to use two different dictionaries, each with its own advantages and disadvantages. The first contains various possible forms of author-year citations, e.g. Clark 2008, Clark (2008, Clark Stephen (2008, Clark S. 2008, Clark S. (2008. The advantage of this dictionary is that the matches are almost 100% unique. However, the probability of a match is fairly low, because the way citations are written varies so much. The second contains the source titles. Here the situation is exactly the opposite: you can very easily get false positive matches if the title is too generic, but no match is missed if the title is sufficiently distinctive.
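The first dictionary can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the AppleScript workflow itself: the function names `citation_variants` and `build_dictionary` and the placeholder link targets are hypothetical. It generates the forms listed above (plus the analogous "Clark Stephen 2008" form) and enforces the rule that each search term must be unique within a dictionary.

```python
def citation_variants(last, first, year):
    """Build author-year search terms for one source."""
    initial = first[0] + "."
    names = [last, f"{last} {first}", f"{last} {initial}"]
    variants = []
    for name in names:
        variants.append(f"{name} {year}")   # e.g. Clark 2008
        variants.append(f"{name} ({year}")  # e.g. Clark (2008
    return variants

def build_dictionary(sources):
    """Map each variant to its source's link; reject ambiguous search terms."""
    dictionary = {}
    for last, first, year, target in sources:
        for term in citation_variants(last, first, year):
            if term in dictionary and dictionary[term] != target:
                raise ValueError(f"ambiguous search term: {term!r}")
            dictionary[term] = target
    return dictionary

# Placeholder target; a real entry would carry the link to the PDF.
terms = build_dictionary([("Clark", "Stephen", 2008, "link-to-clark-2008")])
for term in terms:
    print(term)
```

Raising an error on a duplicate term is one possible design choice: two different sources by a Clark published in 2008 would otherwise silently share one link.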

In my next post I will describe how you can create the link dictionaries from your Mac.

2 comments:

mbb said...

wow - sounds complex and mighty, I'm looking forward to your description how it works.

Do I understand correctly that you want to describe how to automatically generate hyperlinks in existing PDFs (e.g. downloaded papers) to the local file of a cited reference?

Euboulides said...

Yes, you understand correctly.