A huge PowerShell one-liner for Pandoc pdf-ing

I have a directory of markdown files, with subdirectories and sub-subdirectories and so on, all holding the markdown files I keep personal notes in. I find this an incredibly effective system, as I can swiftly move around, search and edit these files in gVim. The directory is easy to copy to my smartphones, which can open text files in various ways, but sometimes a pdf is simpler to open and search on a smartphone. So I would like all of these markdown files converted to pdf by Pandoc, and I would like to be able to redo this efficiently from time to time. That would help on my smartphones, with the added benefit that if I want someone else to access my notes, they’d be available in both markdown and pdf, which would be much easier for everyone.

I wondered if I could write a one-liner that would recursively search the directory for all markdown files and, wherever there isn’t an equivalent pdf with a later date, make one. Scott Hanselman’s one-liner convinced me that I could. So here is the successful development of my first ever big PowerShell one-liner:

Step 1. Recursively list all of the markdown files by name only:

gci -r -i *.md |foreach{echo $_.name}

– I’ve used Get-ChildItem, and $_ for the current pipeline object.

Step 2. For the markdown files in the current directory, list the equivalent pdf files, where they exist, with their “LastWriteTime”s:

gci *.md|foreach{$pdf=$_.directoryname+"\"+$_.basename+".pdf";if(test-path "$pdf"){gi "$pdf"}}

– here I’ve used Get-Item.
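The pdf path that Step 2 builds by string concatenation can also be derived with the standard .NET method [System.IO.Path]::ChangeExtension. The following is only a sketch of that alternative, not what the one-liner itself does; the function name Get-PdfPath is a hypothetical helper introduced for illustration.

```powershell
# Hypothetical helper (not part of the original one-liner): given the full path
# of a markdown file, return the path of the pdf that would sit beside it.
function Get-PdfPath {
    param([string] $MarkdownPath)
    # ChangeExtension only manipulates the string; it does not touch the disk.
    [System.IO.Path]::ChangeExtension($MarkdownPath, '.pdf')
}

# Same idea as Step 2: list the pdfs that already exist next to each markdown file.
Get-ChildItem *.md | ForEach-Object {
    $pdf = Get-PdfPath $_.FullName
    if (Test-Path -LiteralPath $pdf) { Get-Item -LiteralPath $pdf }
}
```

Using -LiteralPath here is a small precaution in case a note's filename contains characters such as square brackets, which Test-Path would otherwise treat as wildcards.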

Step 3. Recursively list pdfs with “LastWriteTime”s earlier than those of their markdowns:

gci -r -i *.md|%{$pdf=$_.directoryname+"\"+$_.basename+".pdf";if(test-path "$pdf"){$mdd=$_.LastWriteTime;gi "$pdf"|?{$_.LastWriteTime -lt $mdd}}}

– here I’ve replaced foreach with %.

Step 4. Recursively list every markdown filepath with its “LastWriteTime”, each followed by the equivalent pdf filepath with either its “LastWriteTime” (plus “- to redo” where the pdf is out-of-date) or, if the pdf doesn’t exist yet, the comment “-- not yet made”. This gives a clear visual picture of which pdfs need to be re-Pandoc’d:

gci -r -i *.md|%{$md=$_.fullname;$mdt=$_.LastWriteTime;"$md -> $mdt";$pdf=$_.directoryname+"\"+$_.basename+".pdf";if(test-path "$pdf"){gi "$pdf"|%{$pdft=$_.LastWriteTime; if($pdft -gt $mdt){"$pdf > $pdft"}else{"$pdf > $pdft - to redo"}}}else{"$pdf -- not yet made"}}

– now giving this sort of output:

[Screenshot: showing the output]

Step 5. Add in a condition to simulate firing off Pandoc for those pdfs that either haven’t been made yet, or need redoing:

gci -r -i *.md|%{$md=$_.fullname;$mdt=$_.LastWriteTime;"$md -> $mdt";$gp=$false;$pdf=$_.directoryname+"\"+$_.basename+".pdf";if(test-path "$pdf"){gi "$pdf"|%{$pdft=$_.LastWriteTime; if($pdft -gt $mdt){"$pdf > $pdft"}else{$gp=$true;"$pdf > $pdft - to redo"}}}else{$gp=$true;"$pdf -- not yet made"}if($gp){"- go pandoc"}}

– just prints “- go pandoc” after each found case.
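The decision that the $gp flag captures in Step 5 can be written out on its own as a small function, which may be easier to read than the nested if/else. This is a sketch under the same rule as the one-liner (a pdf counts as current only if its LastWriteTime is strictly later than the markdown's); the name Test-PdfStale and its parameter names are assumptions made for illustration.

```powershell
# Hypothetical helper: returns $true when Pandoc should be run for this pair,
# i.e. the pdf is missing or is not newer than the markdown file.
function Test-PdfStale {
    param(
        [Parameter(Mandatory)] [string] $MarkdownPath,
        [Parameter(Mandatory)] [string] $PdfPath
    )
    # Case 1: no pdf yet ("not yet made" in the one-liner).
    if (-not (Test-Path -LiteralPath $PdfPath)) { return $true }

    # Case 2: pdf exists; compare timestamps ("to redo" in the one-liner).
    $mdt  = (Get-Item -LiteralPath $MarkdownPath).LastWriteTime
    $pdft = (Get-Item -LiteralPath $PdfPath).LastWriteTime
    return -not ($pdft -gt $mdt)
}
```

With a helper like this, the "- go pandoc" line would be printed whenever Test-PdfStale returns $true for a markdown file and its pdf path.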

Step 6. The finished recursive one-liner, firing off Pandoc for all those markdown files whose pdf is either not there or out-of-date:

gci -r -i *.md|%{$md=$_.fullname;$mdt=$_.LastWriteTime;"$md -> $mdt";$gp=$false;$pdf=$_.directoryname+"\"+$_.basename+".pdf";if(test-path "$pdf"){gi "$pdf"|%{$pdft=$_.LastWriteTime; if($pdft -gt $mdt){"$pdf > $pdft"}else{$gp=$true;"$pdf > $pdft - to redo"}}}else{$gp=$true;"$pdf -- not yet made"}if($gp){"- running pandoc";&pandoc -V mainfont="Arial" --toc --toc-depth=4 -f markdown_strict $md -o $pdf --latex-engine=xelatex}}

– very effective for my needs: I can see what’s happening in the PowerShell window as Pandoc works, and in particular catch errors if any of my markdowns won’t convert to pdf.
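For anyone who would rather keep this in a script file, here is a multi-line sketch of the same logic as Step 6. It is an illustration rather than a replacement: the function name Convert-StaleMarkdown, the -Root parameter and the $failed list are assumptions added here, and the $LASTEXITCODE check is one possible way to collect the files that wouldn't convert instead of watching for errors scroll past. The Pandoc arguments are the ones from the one-liner; note that Pandoc 2.0 and later renamed --latex-engine to --pdf-engine, so the flag may need changing on a newer install.

```powershell
# Hypothetical script version of the Step 6 one-liner.
# Returns the markdown paths for which pandoc reported a non-zero exit code.
function Convert-StaleMarkdown {
    param([string] $Root = '.')

    $failed = @()
    Get-ChildItem -Path $Root -Recurse -Include *.md | ForEach-Object {
        $md  = $_.FullName
        $pdf = Join-Path $_.DirectoryName ($_.BaseName + '.pdf')

        # Run pandoc if the pdf is missing or is not newer than the markdown.
        $gp = $true
        if (Test-Path -LiteralPath $pdf) {
            $gp = -not ((Get-Item -LiteralPath $pdf).LastWriteTime -gt $_.LastWriteTime)
        }

        if ($gp) {
            Write-Host "- running pandoc on $md"
            # Same arguments as the one-liner; use --pdf-engine on Pandoc 2.0+.
            & pandoc -V mainfont="Arial" --toc --toc-depth=4 -f markdown_strict $md -o $pdf --latex-engine=xelatex
            if ($LASTEXITCODE -ne 0) { $failed += $md }
        }
    }
    return $failed
}
```

Calling Convert-StaleMarkdown from the top of the notes directory and printing whatever it returns gives a short list of the markdowns that still need attention.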

You can find more PowerShell, Markdown, and Pandoc pointers in my DokuWiki.
