What is it?¶
This repository consists of source documents in Markdown that can be published to GitHub Pages. It contains a build script in Python that transforms the source documents into published documents.
The source documents are all about file formats and the data types and file extensions involved.
This is a big nest of interlinked information, and one of the tasks of the build script is to provide links between related pages wherever relevant.
How does it work?¶
The source documents in markdown are transformed to HTML pages for a static website in several stages. These stages are controlled by a build script.
The users are divided into owners, contributors and readers.
Readers consult the end product by browsing the documentation pages.
Contributors propose changes to the source documents by creating pull-requests.
Owners accept or reject pull requests. When pull-requests are accepted, they re-build the documentation.
N.B. Eventually the building of the documentation should be automatically triggered upon accepting a pull-request.
More about pull-requests¶
Contributors should first fork the repo to their own space on GitHub. This is an online action, done on the GitHub website.
Then they should clone their fork to their own computer. This is an action that can be done by means of the GitHub Desktop application.
Now they can edit the source documents locally, with the text-edit application of their choice. That can be Notepad for absolute beginners, Vim for absolute experts, or Atom for everybody else.
When they are done, they can push their modifications to their online fork on GitHub. This action can easily be performed with GitHub Desktop again.
When they want their changes to be published, they create a pull-request. This can also be started from the GitHub Desktop app, which leads them to the GitHub website where they can fill in a new pull request.
Owners will be notified of the new pull request and can accept or reject it.
Details of the workflow¶
Here we describe the steps in the publishing workflow in greater detail.
In order to run the build script, you need the following Python modules:
pip3 install pyyaml mkdocs
python3 build.py make
The make step reads the files in directory source and produces the resulting files in directory docs.
It transforms markdown files into other markdown files, using interlinking information and several other settings that are stored in data.yaml.
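To give an impression of what interlinking during the make step could look like, here is a minimal sketch. It is not the code of build.py. It assumes a hypothetical mapping from a page name to the names of its related pages (the actual structure of data.yaml is not shown in this document), and the function name and the "Related pages" label are made up for the example.

```python
def add_related_links(text, page, related):
    """Append links to related pages to the Markdown text of one page.

    `related` is a hypothetical mapping: page name -> list of related page names.
    """
    names = related.get(page, [])
    if not names:
        return text  # nothing to link, leave the page as it is
    lines = [text.rstrip("\n"), "", "Related pages:", ""]
    # One Markdown list item per related page, linking to its .md file.
    lines.extend(f"- [{name}]({name}.md)" for name in sorted(names))
    return "\n".join(lines) + "\n"


related = {"csv": ["tsv", "xlsx"]}  # hypothetical interlinking data
print(add_related_links("# CSV\n\nComma-separated values.\n", "csv", related))
```

Keeping the link data outside the pages means that a contributor only writes the text of a page, while the links are regenerated on every run.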
It also generates a config file for mkdocs, which takes care of the next step. This generated mkdocs.yml is in the top-level directory. It is generated on the basis of mkdocsIn.yml and a list of URLs in urls.list.
The recommended practice is to never enter a URL directly into the source files, but to define an abbreviation for it in urls.list and use the abbreviation in the markdown, wrapped in double braces.
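The abbreviation mechanism can be sketched as follows. This is an illustration under assumptions, not the actual implementation in build.py: the line format of urls.list (an abbreviation, whitespace, then the URL), the treatment of lines starting with # as comments, and both function names are hypothetical.

```python
import re


def parse_url_list(text):
    """Parse lines of the assumed form 'abbreviation url' into a dict."""
    urls = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and (assumed) comment lines
        # A line without a URL part raises ValueError here.
        abbreviation, url = line.split(maxsplit=1)
        urls[abbreviation] = url
    return urls


def expand_urls(markdown, urls):
    """Replace each {{abbreviation}} by its URL; leave unknown ones untouched."""
    def replace(match):
        return urls.get(match.group(1), match.group(0))

    return re.sub(r"\{\{\s*([\w.-]+)\s*\}\}", replace, markdown)


urls = parse_url_list("formats https://Dans-labs.github.io/formats\n")
print(expand_urls("See the [published site]({{formats}}).", urls))
```

The benefit of this indirection is that a URL that changes needs to be corrected in one place only.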
The generation process makes a navigation structure based on the existing documentation source files.
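A navigation structure of this kind could be derived from the directory tree roughly as in the sketch below. It produces nested lists of one-entry dictionaries, which is the shape mkdocs expects under the nav key of its config file. The function name build_nav, the use of file stems as titles, and the ordering (files before directories) are assumptions for the example; the real build script may order and title pages differently.

```python
import tempfile
from pathlib import Path


def build_nav(docs_dir, relative_to=None):
    """Return an mkdocs-style nav list for all .md files below docs_dir."""
    docs_dir = Path(docs_dir)
    root = Path(relative_to) if relative_to else docs_dir
    nav = []
    # Files first, then subdirectories, each group in alphabetical order.
    for entry in sorted(docs_dir.iterdir(), key=lambda p: (p.is_dir(), p.name)):
        if entry.is_dir():
            children = build_nav(entry, root)
            if children:  # skip directories without Markdown files
                nav.append({entry.name: children})
        elif entry.suffix == ".md":
            # Paths in nav are relative to the top of the docs directory.
            nav.append({entry.stem: entry.relative_to(root).as_posix()})
    return nav


# Try it on a small throw-away directory tree.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "types").mkdir()
    (Path(tmp) / "index.md").write_text("# Home\n")
    (Path(tmp) / "types" / "image.md").write_text("# Image\n")
    print(build_nav(tmp))
```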
python3 build.py build
The step that generates HTML from markdown is the task of a static page generator. mkdocs is such a program, written in Python.
It reads markdown files in the docs folder and produces HTML files in the site folder. During the process it applies styles, headers, and footers, and it creates navigation links.
The fine details of the website layout are in mkdocs.yml, which we generated in the previous step.
site to the local browser¶
python3 build.py docs
This will display the website in site in your browser.
site to the published website on GitHub¶
python3 build.py g "commit message"
Use this command to publish the website in site to GitHub Pages. In fact, this step also performs the previous build steps, so you can completely publish the current contents with a single invocation of the build script.
Behind the scenes, mkdocs collects the contents of site, stores it in a separate branch called gh-pages, and pushes the gh-pages branch to GitHub. From there, GitHub takes over and publishes the contents at https://Dans-labs.github.io/formats, where all eyes of the world can see it.
Here you can see how the repo has been set up for GitHub Pages.
This step will also commit the changes to GitHub.
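To summarize the tasks described above, here is a hedged sketch of how a build script might hand the build, docs and g tasks over to mkdocs. It is not the actual build.py. The subcommands build, serve and gh-deploy and the --message option are part of the mkdocs command line, but the mapping of docs to mkdocs serve, the default commit message, and the function names are assumptions. The make task and the git commit of the source files are left out.

```python
import subprocess


def mkdocs_command(task, message=None):
    """Return the mkdocs command line for a task, or None if mkdocs is not involved."""
    if task == "build":
        return ["mkdocs", "build"]  # markdown in docs -> HTML in site
    if task == "docs":
        return ["mkdocs", "serve"]  # assumption: local preview server
    if task == "g":
        # gh-deploy builds the site and pushes it to the gh-pages branch.
        return ["mkdocs", "gh-deploy", "--message", message or "update docs"]
    return None  # e.g. "make", which is plain Python in this sketch


def run_task(task, message=None):
    """Run the mkdocs step for a task; return its exit code (0 if nothing to run)."""
    command = mkdocs_command(task, message)
    if command is None:
        return 0
    return subprocess.run(command).returncode


print(mkdocs_command("g", "fix typos"))
```

Separating the construction of the command from running it makes the mapping easy to inspect without mkdocs being installed.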
Work to do¶
In order to turn this pilot into a productive platform to collaborate on preferred formats documentation, the following things might be needed in due course:
Assign a person who is knowledgeable about Python and GitHub to take care of the workflow. This person will play a role in all following steps.
Make sure that build.py gets triggered on GitHub whenever a pull request gets accepted.
Write instructions for collaborators:
- point to an easy markdown guide
- point to GitHub Desktop and Atom
- explain the modification process by means of commits and pull requests
- explain how to file Issues
- explain what the true source files are and what the generated files are, so that people do not modify generated files and see their modifications getting lost the next time the build process is run
Extend the data model / adapt the conventions: So far, all texts have been written from the DANS perspective. You might want to distinguish between the recommendations of the participating institutions. You have to think about how to organize that. Maybe you need another folder institutes, below which they specify their particular recommendations per file type.
Rebrand the published content: Currently, the published GitHub Pages are in a neutral readthedocs style. You might want to develop a more DANS-specific template. I have already made a start, see mkdocs-dans, but this no longer works. However, a better thing to do might be the following:
Build other publication channels: Instead of (only) publishing content via GitHub pages, you might want to build other channels for other avenues. This is something that other institutes could do for themselves, or we could provide middleware to do that: a program that reads the source files, that can be fed with institute-specific settings and logos, and that will produce a set of custom static pages ready to be included by an institutional web site.
Modify the collaboration workflow: If the GitHub workflow of committing changes and creating/accepting pull-requests turns out to be too cumbersome for the target audience, you could consider using Google Docs for direct collaboration on texts, and only when agreement is reached there, pulling the content from there and ingesting it in this same GitHub repo. Owners will then push the changes and trigger the build script. Collaborators do not have to be aware of GitHub. You still have a lot of change history in your GitHub repository, and you still enjoy various advantages of GitHub, such as easy transfer of releases to Zenodo and archiving by the Software Heritage Archive.