Confused and uncertain about the future of Wordpress?
Don’t try to fix it–Wordprexit!
Actually, it’s just a Python command-line tool to port a Wordpress blog to Hugo. Many tools already do this job, but each has its shortcomings. Here’s what Wordprexit can do:
- Convert your Wordpress posts to Hugo posts. All the expected fields are populated in each entry’s front matter. Messy HTML is filtered through a port of the wpautop algorithm to provide paragraph breaks and converted to clean, valid Markdown.
- Parse all the different Wordpress date/time formats–yes, there are several, including creative touches like intentional use of the year “-0001” and missing timezone information–into standard ISO format.
- Convert Wordpress [shortcodes] to HTML.
- Replace <img>, <figure>, and <blockquote> HTML tags with Hugo shortcodes to provide better options for styling, responsive image handling, and so on.
- Download embedded images in the largest size possible, regardless of whether they were part of your Media Library, and place them in a Hugo Page Bundle with your entry. Each Wordpress image title is extracted (when present) and placed in your resources front matter for use by your shortcodes.
- Convert all your comments to Staticman-compatible JSON files in your data/comments directory.
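To give a feel for the date/time problem mentioned in the list, here is a minimal sketch of that kind of normalization. It is not Wordprexit’s actual code: the function name normalize_wp_date is made up for illustration, the sample strings are assumed examples of what an export can contain, and treating a timezone-less timestamp as UTC is an assumption (a real converter might use the blog’s configured timezone instead).

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def normalize_wp_date(raw: str) -> Optional[str]:
    """Return an ISO 8601 string, or None for placeholder/unparseable dates.

    Hypothetical helper for illustration; not part of Wordprexit's API.
    """
    raw = raw.strip()
    # Placeholder values that carry no real date (assumed examples):
    # a year of "-0001" or an all-zero timestamp.
    if not raw or " -0001 " in raw or raw.startswith("0000-00-00"):
        return None

    # Form 1: "YYYY-MM-DD HH:MM:SS" with no timezone. Assumption: treat as UTC.
    try:
        naive = datetime.strptime(raw, "%Y-%m-%d %H:%M:%S")
        return naive.replace(tzinfo=timezone.utc).isoformat()
    except ValueError:
        pass

    # Form 2: RFC 2822 style, e.g. "Sun, 01 Mar 2015 12:34:56 +0000".
    try:
        parsed = parsedate_to_datetime(raw)
    except (TypeError, ValueError):
        return None
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.isoformat()


if __name__ == "__main__":
    for sample in (
        "2015-03-01 12:34:56",
        "Sun, 01 Mar 2015 12:34:56 +0000",
        "Wed, 30 Nov -0001 00:00:00 +0000",
    ):
        print(repr(sample), "->", normalize_wp_date(sample))
```

Returning None for placeholder dates leaves the decision to the caller, which could fall back to another date field on the same entry.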
Installation and use
It’s as simple as:
# Install CLI tool
pip install wordprexit
Now download the WXR file from Wordpress. You can do this from “Tools > Export” in the admin interface. If given the option, include both posts and media, so that the exported file contains post_type “attachment” and post_type “post” data. (The “attachment” entries are your Media Library metadata.)
# Parse file (will create content/ and data/ tree in current directory)
wordprexit wxrfile.xml
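If you’re not sure whether your export includes both posts and media, you can check the post_type values yourself. The sketch below uses only the Python standard library and is independent of Wordprexit; the function name count_post_types and the tiny inline sample are made up for illustration, and the namespace URI in the sample is an assumption, which is why the code matches on the tag’s local name only.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def count_post_types(wxr_xml: str) -> Counter:
    """Count <item> elements in a WXR document by their post_type value.

    Hypothetical helper for illustration; not part of Wordprexit.
    """
    root = ET.fromstring(wxr_xml)
    counts = Counter()
    for item in root.iter("item"):
        for child in item:
            # Namespaced tags look like "{namespace-uri}post_type". Compare
            # only the local part so the namespace URI does not matter.
            if child.tag.rsplit("}", 1)[-1] == "post_type":
                counts[(child.text or "").strip()] += 1
    return counts


# A tiny stand-in for a real export (namespace URI is an assumed example).
SAMPLE = """<rss version="2.0" xmlns:wp="http://wordpress.org/export/1.2/">
  <channel>
    <item><title>Hello</title><wp:post_type>post</wp:post_type></item>
    <item><title>photo.jpg</title><wp:post_type>attachment</wp:post_type></item>
  </channel>
</rss>"""

if __name__ == "__main__":
    print(count_post_types(SAMPLE))
```

For a real export, read the file’s contents and pass them in; if “attachment” is missing from the counts, the export was probably made without media.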
Source code
You can check out the source on GitHub: https://github.com/2n3906/wordprexit