How to convert Markdown to PDF

 ·  ☕ 8 min read  ·  ✍️ Nishant

Writing documentation is part of almost every developer job. Although writing code is the primary focus for any developer, the core understanding and reasoning behind an implementation can only be captured in well-written, human-readable documentation.

I put emphasis on human-readable because this documentation needs to be understood by, well, humans. If we throw in a lot of jargon, acronyms and code without explaining them in simple language, the result is close to unusable for people.

Another important aspect is how the documentation is formatted. If it is just blobs of text, it becomes boring to read and people tend to skip parts of it. This is easy to fix. The best and easiest way is to write in Markdown, which is simple to write and supports code blocks that most renderers syntax-highlight. It follows a lightweight syntax and is saved to a file with the .md extension. In fact, this post is written in markdown 😎

The only problem is that many folks don’t actually know how to consume a .md file. Most are familiar with a more common format, i.e. a .pdf file. Almost all major operating systems can handle PDF files.

I recently had to write a specific integration guide as part of our documentation and deliver it to a client separately from our hosted documentation. Now, I could have written all of that in a Word doc, a plain text file or a Google Doc. However, then I would not get all the nice features that markdown provides. Plus I would have to deal with the GUI of each of these tools. So I decided to write the integration guide as a markdown file. The whole write-up was done quickly, as I have been using markdown for some time now. Now comes the tricky part.

I needed to provide this file as a PDF file. So I started searching for possible ways I could do that.

The golden tool to pick for this would be pandoc.

Read about how to install pandoc here.
NOTE: In order to render PDF, pandoc requires LaTeX to be installed. Make sure it is installed before proceeding.

Assuming our markdown file is called integration_guide.md, the command below converts it to integration_guide.pdf:

pandoc integration_guide.md -s -o integration_guide.pdf

As soon as I ran this command, it threw an error:

Error producing PDF.
! Package inputenc Error: Unicode character 😅 (U+1F605)

So it cannot handle emojis! Oh no 😕

Anyway, I still wanted to confirm that it works, so I got rid of all emojis in the markdown file and ran the command again.
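Removing emojis by hand gets tedious for longer files. Below is a minimal sketch of how that step could be automated in Node.js. The name stripEmoji is hypothetical and not part of pandoc or any library, and the regular expression is an assumption about what counts as an emoji: it may also remove a few pictographic symbols such as ©.

```javascript
// Hypothetical helper (not part of pandoc): remove emoji characters so that
// the LaTeX step does not fail on them.
function stripEmoji(markdown) {
  // Matches pictographic characters, skin-tone modifiers, the variation
  // selector U+FE0F and the zero-width joiner U+200D used in emoji sequences.
  const emojiPattern = /[\p{Extended_Pictographic}\p{Emoji_Modifier}\u{FE0F}\u{200D}]/gu;
  return markdown.replace(emojiPattern, '');
}

// Example: the emoji is removed, the rest of the text is left untouched.
console.log(stripEmoji('Integration guide is ready 😅'));
```

The cleaned text could then be written to a temporary file and passed to pandoc instead of the original markdown file.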

Pandoc rendered markdown as pdf

Check out the final generated integration_guide.pdf file.

Simple and fast!

This should solve the problem. Well, not really 🥺

The thing is that I like GitHub Flavored Markdown (yes, there are different flavors of Markdown, like ice cream 🍦).

Turns out that with pandoc this is not quite possible (or I couldn’t find out how; if you have an idea how to do it, let me know on Twitter 😅). Also, I found out that installing LaTeX is a huge download 👀. On top of that, not being able to render emojis is already a deal-breaker for me because I ♥️ using emojis. So I started looking for some other tool that would let me render my markdown into a PDF that looks the same as how GitHub renders markdown.

After an hour of trying out various tools, I stumbled upon grip. From the GitHub repository:

Render local readme files before sending off to GitHub.

Grip is a command-line server application written in Python that uses the GitHub markdown API to render a local readme file. The styles and rendering come directly from GitHub, so you’ll know exactly how it will appear. Changes you make to the Readme will be instantly reflected in the browser without requiring a page refresh.

NOTE: Make sure grip is installed by running pip3 install grip.

Using grip for my use-case was as simple as running

grip integration_guide.md

…this starts a local server at http://localhost:6419/

 * Serving Flask app "grip.app" (lazy loading)
 * Environment: production
   WARNING: This is a development server. Do not use it in a production deployment.
   Use a production WSGI server instead.
 * Debug mode: off
 * Running on http://localhost:6419/ (Press CTRL+C to quit)

On opening http://localhost:6419/ in a browser of your choice (I use MS Edge), you are presented with a nice GitHub Flavored Markdown render of your file.

Local server-rendered markdown file in browser

Sweet 😎

But wait, what is this fake tab in the rendered webpage?

fake tab in the render

In order to get rid of this “Fake Tab”, I checked the documentation of grip. Nothing stood out as a proper solution until I found this issue/comment.

Basically, the solution is to load the markdown via stdin and then export it as an HTML file. Sounds good. I combined the two steps and came up with the one-liner below:

cat integration_guide.md | grip - --export integration_guide.html && open integration_guide.html

which I converted into a bash function like below:

# Drop this in your ~/.bashrc or ~/.zshrc file
# Use as: convertMarkdownToHtml your_markdown_file.md
# Dependency: Install grip, https://github.com/joeyespo/grip
function convertMarkdownToHtml(){
    # Only accept file names that end in .md
    if [[ "$1" == *.md ]]; then
        # Filename without directory and extension
        local FILE_NAME=$(basename "$1" .md)
        # Read the markdown file and then convert it to an HTML file
        # (the HTML file is written to the current directory)
        cat "$1" | grip - --export "$FILE_NAME.html"
        # Open the generated HTML file in a browser window
        # (open is the macOS command; on Linux use xdg-open)
        open "$FILE_NAME.html"
    else
        echo "Passed file is not of markdown type. Please pass a .md file"
    fi
}

With the bash function in place, I simply need to run:

convertMarkdownToHtml integration_guide.md

At this point I can utilize the Print functionality of the browser (CMD + P/ Ctrl + P) to save as PDF:

Print dialog in browser

…and hit Print or Save.

Save file

On opening the generated PDF file I have:

PDF file

Check out the final generated integration_guide.pdf file.

However, I am not quite happy with this solution. It works, yes, but I really don’t want to open a browser and press a few keys just to print the final PDF file. It would be nice if I could skip this step and convert the HTML file to PDF directly with a command-line tool. This gave me an idea 💡 What if I could open a browser window in headless mode and print the document programmatically 🧐? Kind of like emulating the process we figured out earlier 👀

This is not new for me and I knew where to start for this part. The best tool I know of for a headless, programmable browser is Puppeteer. After quickly reading through the docs for my specific use case, I came up with the script below, which I saved as renderToPdf.js:

NOTE: Make sure puppeteer is installed by executing npm i puppeteer in your directory.

const puppeteer = require('puppeteer');

(async () => {
    // Get the cmdline arguments
    const myArgs = process.argv.slice(2);
    // The first argument is the HTML file
    const inputHtmlFile = myArgs[0];
    if (!inputHtmlFile) {
        console.error('Usage: node renderToPdf.js <file.html>');
        process.exit(1);
    }
    // Replace the input file's extension with .pdf to get the output filename
    const outputPdfFileName = inputHtmlFile.replace(/\.[^/.]+$/, '') + '.pdf';
    // Prepare the file URL (the HTML file is expected next to this script)
    const filePath = 'file:///' + __dirname + '/' + inputHtmlFile;

    // Launch the headless browser
    const browser = await puppeteer.launch();
    try {
        const page = await browser.newPage();
        // Wait until network activity has settled so styles and images are loaded
        await page.goto(filePath, { waitUntil: 'networkidle0' });
        // Print the page to a PDF file, including background colors
        await page.pdf({
            path: outputPdfFileName,
            printBackground: true,
            format: 'A4',
            margin: {
                left: '40px',
                right: '40px',
            },
        });
    } finally {
        // Always close the browser, even if rendering failed
        await browser.close();
    }
})();

To run the renderToPdf.js script, I execute the command below in a terminal:

node renderToPdf.js integration_guide.html

Check out the final integration_guide.pdf file generated via Puppeteer.

We can even make this step part of the bash function we created earlier. The updated function looks like this:

# Convert Markdown to PDF
# Use as: convertMarkdownToPdf your_markdown_file.md
function convertMarkdownToPdf(){
    # Only accept file names that end in .md
    if [[ "$1" == *.md ]]; then
        # Filename without directory and extension
        local FILE_NAME=$(basename "$1" .md)
        # Read the markdown file and then convert it to an HTML file
        cat "$1" | grip - --export "$FILE_NAME.html"
        # Render HTML to PDF (run this from the directory containing renderToPdf.js)
        node renderToPdf.js "$FILE_NAME.html"
        # Open the generated PDF file
        # ([[ ]] is used because [ ... == ... ] fails in zsh)
        if [[ "$(uname)" == "Darwin" ]]; then
            # macOS
            open "$FILE_NAME.pdf"
        else
            # Linux
            xdg-open "$FILE_NAME.pdf"
        fi
    else
        echo "Passed file is not of markdown type. Please pass a .md file"
    fi
}

With the bash function in place, I simply need to run:

convertMarkdownToPdf integration_guide.md

NOTE: The bash function has been renamed from convertMarkdownToHtml to convertMarkdownToPdf

…and the markdown file is first converted to HTML, then opened in a headless browser and printed as a PDF file, and finally opened in the associated PDF reader!
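For anyone who prefers to keep the whole pipeline in Node.js, the same three steps could be sketched as below. This is a rough sketch, not the code from the project: the names buildSteps and runSteps are hypothetical, and it assumes that grip, node and open/xdg-open are on the PATH and that renderToPdf.js is in the current directory.

```javascript
const fs = require('fs');
const path = require('path');
const { execFileSync } = require('child_process');

// Hypothetical helper: list the commands the bash function runs, in order.
function buildSteps(markdownFile, platform = process.platform) {
  if (path.extname(markdownFile) !== '.md') {
    throw new Error('Passed file is not of markdown type. Please pass a .md file');
  }
  const base = path.basename(markdownFile, '.md');
  return [
    // Markdown -> HTML; grip reads the markdown from stdin ("-")
    { cmd: 'grip', args: ['-', '--export', `${base}.html`], stdinFile: markdownFile },
    // HTML -> PDF via the Puppeteer script
    { cmd: 'node', args: ['renderToPdf.js', `${base}.html`] },
    // Open the PDF: "open" on macOS, "xdg-open" elsewhere
    { cmd: platform === 'darwin' ? 'open' : 'xdg-open', args: [`${base}.pdf`] },
  ];
}

// Hypothetical runner: executes each step and throws if a command fails.
function runSteps(steps) {
  for (const { cmd, args, stdinFile } of steps) {
    const input = stdinFile ? fs.readFileSync(stdinFile) : undefined;
    execFileSync(cmd, args, { input, stdio: ['pipe', 'inherit', 'inherit'] });
  }
}

// Example: print the planned commands without running them.
console.log(buildSteps('integration_guide.md'));
```

Calling runSteps(buildSteps('integration_guide.md')) would run the pipeline. Unlike the bash function, this sketch stops at the first command that fails.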

That’s it! Everything works 😎 It is quick and solved my immediate requirement.

All this code now also exists on GitHub as a project! 🎉

Sidenote:
There are possibly many other tools that could have been useful too, e.g. markdown-pdf. However, I am quite happy with my setup here 😅

Written by Nishant Srivastava
👨‍💻 Android Engineer/🧢 Opensource enthusiast