Pushing content into any format with Jekyll
In the previous post, I talked about help APIs as a way to deliver help inside applications. In this post, I'll explain how to push your help content into any format.
Let's say that you have three different channels where you want to push your help content. Channel one is an S3 bucket in Amazon Web Services (requiring HTML), channel two is a Salesforce Knowledge center (requiring CSV), and channel three is your help API (requiring JSON).
Jekyll (and probably a lot of static site generators) provides an amazing capability here. With most help authoring tools, you see a list of outputs (e.g., PDF, HTML, Eclipse Help, etc.), and you're pretty much limited to those outputs. Jekyll, however, allows you to define the templates and formats that your content is pushed into.
Let's remember the three different format channels for this scenario:
- For channel 1 (S3), the content needs to be in HTML.
- For channel 2 (Salesforce), the content needs to be in CSV (for batch import into Knowledge).
- For channel 3 (API), the content needs to be in JSON.
Pushing content into HTML
Since HTML is the main publishing use case, most of Jekyll is structured around facilitating this template. You author content in a Markdown or HTML page. At the top of this page, you specify the layout you want in the frontmatter, like this:
---
layout: page
title: My Page
permalink: /mypage/
---
Jekyll will take all the content in this page and stuff it inside the {{content}} tag in the layout you specified (page.html). Page.html also usually has a layout defined (default.html), so Jekyll takes all the content stuffed into the page.html layout and stuffs it into the {{content}} tag on the default.html page.
(You could just specify your layout as default from the beginning, but you might have various HTML layouts, such as a layout for pages, posts, and specific content types (such as API doc), which all plug into default.html.)
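The nesting described above can be sketched outside Jekyll. The following Python snippet is only an illustration, not Jekyll's implementation: the LAYOUTS dictionary and the render function are hypothetical names, and plain string replacement stands in for Liquid rendering.

```python
# Hypothetical stand-ins for the files in _layouts/; real layouts are Liquid templates.
LAYOUTS = {
    "default": {"layout": None, "template": "<html><body>{{content}}</body></html>"},
    "page": {"layout": "default", "template": "<article>{{content}}</article>"},
}


def render(content, layout_name):
    """Wrap content in its layout, then in that layout's parent, and so on."""
    while layout_name is not None:
        layout = LAYOUTS[layout_name]
        # Stuff the content into this layout's {{content}} placeholder.
        content = layout["template"].replace("{{content}}", content)
        # Move up to the parent layout, if this layout declares one.
        layout_name = layout["layout"]
    return content


html = render("<p>My Page</p>", "page")
print(html)
```

The page content lands inside the page layout first, and that combined result then lands inside the default layout, which is the same order of stuffing described above.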
Pushing content into JSON
Now let's move to the JSON use case. Rather than stuffing content into {{content}} tags, you create a file that looks like this:
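---
layout: none
search: exclude
---
{
  "entries": [
    {% for page in site.tooltips %}
    {
      "id": "{{page.id}}",
      "body": "{{page.content}}"
    }{% unless forloop.last %},{% endunless %}
    {% endfor %}
  ]
}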
"tooltips" is the name of a collection I created inside Jekyll. This code has the basic structure of the JSON that I want, but you'll notice some placeholders. A for loop iterates through all the pages inside the tooltips collection and, for each page, the page's id gets inserted into the {{page.id}} placeholder, and the page content gets inserted into the {{page.content}} placeholder.
Assuming our pages were short descriptions of sports, here's what the result might look like:
{
  "entries": [
    {
      "id": "baseball",
      "body": "Baseball is considered America's past-time sport, though that may be more of a historical term than a current one. There's a lot more excitement about football than baseball. A baseball game is somewhat of a snooze to watch, for the most part."
    },
    {
      "id": "basketball",
      "body": "Basketball is a sport involving two teams of five players each competing to put a ball through a small circular rim 10 feet above the ground. Basketball requires players to be in top physical condition, since they spend most of the game running back and forth along a 94-foot-long floor."
    },
    {
      "id": "football",
      "body": "No doubt the most fun sport to watch, football also manages to accrue the most injuries with the players. From concussions to blown knees, football players have short sport lives."
    },
    {
      "id": "soccer",
      "body": "If there's one sport that dominates the world landscape, it's soccer. However, US soccer fans are few and far between. Apart from the popularity of soccer during the World Cup, most people don't even know the name of the professional soccer organization in their area."
    }
  ]
}
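One thing to watch with a plain text template like this: the page content is dropped between the quotation marks as-is, so a double quote or line break inside a page would produce invalid JSON. As a rough illustration of the same loop with escaping handled by a JSON library, here is a minimal Python sketch. The tooltips list and the build_entries function are hypothetical stand-ins for the collection, and the sample text is invented to include a quote and a line break.

```python
import json

# Hypothetical stand-in for the pages in the "tooltips" collection.
tooltips = [
    {"id": "baseball", "content": 'Baseball is called "America\'s pastime."'},
    {"id": "soccer", "content": "Soccer dominates the world landscape.\nLess so in the US."},
]


def build_entries(pages):
    """Mirror the template's for loop: one {"id", "body"} object per page."""
    return {"entries": [{"id": p["id"], "body": p["content"]} for p in pages]}


# json.dumps escapes the embedded quotes and the line break for us.
output = json.dumps(build_entries(tooltips), indent=2)
print(output)
```

Within Jekyll itself, the jsonify filter serves a similar purpose when a value needs to be escaped for JSON.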
Pushing content into CSV
CSV requires a different format from either HTML or JSON. And admittedly, here's where things get a little theoretical because I haven't actually tested this.
Here's a typical CSV format that I just pulled off the web:
policyID,statecode,county,eq_site_limit,hu_site_limit,fl_site_limit,fr_site_limit,tiv_2011,tiv_2012,eq_site_deductible,hu_site_deductible,fl_site_deductible,fr_site_deductible,point_latitude,point_longitude,line,construction,point_granularity
119736,FL,CLAY COUNTY,498960,498960,498960,498960,498960,792148.9,0,9979.2,0,0,30.102261,-81.711777,Residential,Masonry,1
448094,FL,CLAY COUNTY,1322376.3,1322376.3,1322376.3,1322376.3,1322376.3,1438163.57,0,0,0,0,30.063936,-81.707664,Residential,Masonry,3
You have a top row of comma-separated values, and then data in rows below that following the same pattern. Commas separate each value.
In Jekyll, you would first make sure your pages had all the frontmatter tags corresponding to each of these CSV headers. Here's what one page might look like:
---
policyID: 119736
statecode: FL
county: CLAY COUNTY
eq_site_limit: 498960
hu_site_limit: 498960
fl_site_limit: 498960
fr_site_limit: 498960
tiv_2011: 498960
tiv_2012: 792148.9
eq_site_deductible: 0
hu_site_deductible: 9979.2
fl_site_deductible: 0
fr_site_deductible: 0
point_latitude: 30.102261
point_longitude: -81.711777
line: Residential
construction: Masonry
point_granularity: 1
---
You then create a file with a .csv extension, such as data.csv. In this file, you add some basic frontmatter at the top so that Jekyll processes the file as a page. And then you iterate through each of the pages using a for loop and stuff the data into your CSV template.
I'll pretend that I've created a collection here called "policies," and that each of my pages exists inside _policies.
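---
layout: none
search: exclude
---
policyID,statecode,county,eq_site_limit,hu_site_limit,fl_site_limit,fr_site_limit,tiv_2011,tiv_2012,eq_site_deductible,hu_site_deductible,fl_site_deductible,fr_site_deductible,point_latitude,point_longitude,line,construction,point_granularity
{% for page in site.policies %}{{page.policyID}},{{page.statecode}},{{page.county}},{{page.eq_site_limit}},{{page.hu_site_limit}},{{page.fl_site_limit}},{{page.fr_site_limit}},{{page.tiv_2011}},{{page.tiv_2012}},{{page.eq_site_deductible}},{{page.hu_site_deductible}},{{page.fl_site_deductible}},{{page.fr_site_deductible}},{{page.point_latitude}},{{page.point_longitude}},{{page.line}},{{page.construction}},{{page.point_granularity}}
{% endfor %}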
See how you access content in the frontmatter using the page namespace? page.policyID gets the value for the policyID in the frontmatter, and so on. The for loop would go through each of the pages and construct a new row of comma-separated values until it reached the end of the pages in the policies collection.
When you build your Jekyll site, Jekyll will recognize the data.csv file as needing to be processed because of the frontmatter tags. You will find a fully populated data.csv file in your build folder.
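Since the CSV approach here is untested, it may help to see the same row-building loop in a form that can be run directly. The Python sketch below is an illustration only, not part of the Jekyll setup. The policies list and build_csv function are hypothetical, only a handful of the columns are used to keep it short, and the second county value is invented to show what happens when a value itself contains a comma.

```python
import csv
import io

# A shortened set of columns from the CSV header shown earlier.
FIELDS = ["policyID", "statecode", "county", "line", "construction"]

# Hypothetical stand-in for the frontmatter of pages in the "policies" collection.
policies = [
    {"policyID": 119736, "statecode": "FL", "county": "CLAY COUNTY",
     "line": "Residential", "construction": "Masonry"},
    # Invented county value containing a comma, to show quoting.
    {"policyID": 448094, "statecode": "FL", "county": "CLAY COUNTY, NORTH",
     "line": "Residential", "construction": "Masonry"},
]


def build_csv(pages, fields):
    """Write a header row, then one comma-separated row per page."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    for page in pages:
        writer.writerow({name: page[name] for name in fields})
    return buffer.getvalue()


csv_text = build_csv(policies, FIELDS)
print(csv_text)
```

The csv module wraps the second county value in quotes so the extra comma doesn't shift the columns. A plain text template won't do that on its own, so frontmatter values that contain commas would need similar handling before a batch import.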
Not bound by format
Because of this flexibility in constructing templates to stuff content into, you're not bound by a specific format, for the most part. You create your content, decide on the template, and then Jekyll shoves the content inside the template. The template could be HTML, JSON, CSV, or something else. This way you can author content in a way that is separate from format. (You could equally create a template that stuffs this same arcane policy information content into a JSON file, for example.)
No doubt many tools do a similar kind of thing on the backend. I just never really understood what was happening when I selected a certain output. Jekyll exposes this processing in a clear and simple way. Now your content can travel into any number of systems in a seamless way.
There is at least one limitation with the formats, though. You can't really create a DITA template and push the content into a DITA format except maybe in the most general way, with body including just HTML. This is because DITA has some very specific structures, and this simplistic template method won't really wrap lists inside task elements, convert links inside pages into xrefs, enforce element order, and so forth.
Other more sophisticated formats might have similar restrictions. However, my point is that Jekyll allows you to separate out your content from the template (presentation), and this is a huge deal when it comes to processing and displaying information.