…and I'm all out of bubble gum.
Fixing PowerPoint 2008 Web Exports
So, I’ve spent the last few days wrestling with a curriculum unit that an outside consultant built. In PowerPoint. On Windows. And which we have been trying to set up in such a way that we can share the interactive document with students. Who are using Macs. And, perhaps, without asking each student to download a ~200MB file to use it.
I have learned and grown much in the process. And have discovered that Microsoft PowerPoint 2008 does an execrable job of exporting PowerPoints as web pages (it does an execrable job of doing a lot of other things too, but we can talk about that at another time). Here are the key fixes that I made to the exported web page and supporting files so that the presentation would fundamentally work (all of this was done using regular expressions in TextMate):
- I stripped out all of the fancy JavaScript calls that PowerPoint inserted as links to navigate from one slide to another. It turns out that a simple HREF to the actual page’s HTML file works (and the JavaScript Does Not).
Find:
(href=")[^"]*(slide\d{4}\.htm)[^"]*(")
Replace with:
$1$2$3
- The export to web page takes all of the already URL-encoded links in the PowerPoint and reencodes them, rendering them useless. I stripped off the second encoding.
Find:
(%)25([a-fA-F0-9]{2})
Replace with:
$1$2
- Finally, because the links were built in Windows and then URL-encoded, all of the Windows-style paths needed to be turned into POSIX paths for use on the web.
Find:
%5[cC]
Replace with:
/
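If TextMate isn't handy, the same three find-and-replace passes can be sketched in Python. This is a rough translation, not the exact steps used above: the function name `fix_exported_html` is made up for illustration, the dot before `htm` is escaped so it only matches a literal period, and the sample link in the usage note is a guess at the kind of JavaScript call PowerPoint emits.

```python
import re


def fix_exported_html(html: str) -> str:
    """Apply the three basic fixes to the text of one exported HTML file."""
    # 1. Replace the JavaScript navigation call with a plain link to the slide file.
    html = re.sub(r'(href=")[^"]*(slide\d{4}\.htm)[^"]*(")', r'\1\2\3', html)
    # 2. Strip the second layer of URL-encoding: %25XX becomes %XX.
    html = re.sub(r'(%)25([a-fA-F0-9]{2})', r'\1\2', html)
    # 3. Turn URL-encoded backslashes (%5C) into forward slashes.
    html = re.sub(r'%5[cC]', '/', html)
    return html
```

Calling `fix_exported_html` on a hypothetical link such as `href="javascript:GoToSld('slide0003.htm');"` should leave just `href="slide0003.htm"`. The order matters a little: the double-encoding fix has to run before the backslash fix, since `%255C` only becomes `%5C` after the second pass.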
At this point, in an average PowerPoint, most of the damage has been fixed and things more or less work. However, the curriculum unit that we were working with also linked to external Word documents (hence some of the Windows-style path issues above). This meant I had a few more fixes along the way that are worthy of note:
- I replaced the links to Word documents with links to the corresponding PDF files (the script I used generated PDF files with .doc.pdf extensions, and I didn’t bother to fix that).
Find:
(href="[^"]*\.docx?)(")
Replace with:
$1.pdf$2
- These links to external documents open in the same frame as the slideshow. Which defeats the purpose of the slideshow being a navigational tool. So I redirected all of the new PDF links to a new window in the browser. As the hyperlinks are broken across two lines in the HTML source code, this took two steps.
- Find (changing {{name of Links & Sources folder}} to the, well, actual name of the Links & Sources folder):
(href="((http://)|({{name of Links & Sources folder}}))[^"]*")\n
Replace with:
$1
- Find (modifying as noted above):
(href="((http://)|({{name of Links & Sources folder}}))[^"]*"\starget=")_top(")
Replace with:
$1_blank$5
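The Word-to-PDF and new-window fixes can be sketched the same way in Python. Treat this as an approximation under a few assumptions: `retarget_document_links` and its `links_folder` parameter are hypothetical names, the folder name is passed in exactly as it appears at the start of the `href` values, and the line-joining step swaps the line break (plus any indentation after it) for a single space so that the second pattern's `\s` has something to match.

```python
import re


def retarget_document_links(html: str, links_folder: str) -> str:
    """Point Word links at their PDFs and open external links in a new window."""
    # Word documents were converted to PDFs named like notes.doc.pdf / notes.docx.pdf.
    html = re.sub(r'(href="[^"]*\.docx?)(")', r'\1.pdf\2', html)

    # Matches an href that starts with http:// or with the Links & Sources folder.
    # re.escape keeps characters such as "&" or "." in the folder name literal.
    link = r'href="(?:http://|' + re.escape(links_folder) + r')[^"]*"'

    # Step 1: pull the attribute on the following line up onto the href's line.
    html = re.sub(r'(' + link + r')\n\s*', r'\1 ', html)
    # Step 2: open those links in a new window instead of the slideshow's frame.
    html = re.sub(r'(' + link + r'\starget=")_top(")', r'\1_blank\2', html)
    return html
```

Links between slides start with neither `http://` nor the folder name, so they keep their `_top` target and the slideshow still navigates in place.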
This entry was posted by Seth Battis on September 24, 2009 at 11:18 AM, and is filed under Educational Technology, How To.


about 11 months ago
It seems there’s at least one more thing that’s broken in the PowerPoint 2008 “Save as Web Page…” export: hyperlinks that are more granular than a whole object are lost. For example, if you have, say, a bulleted list of items and each item is a separate hyperlink, those links are not exported. The quick fix that I worked out was to break each item of the bulleted list out into its own text box and hyperlink the text box itself (rather than just the text) before exporting to the web. Inelegant, but it works (and no, simply breaking apart the bulleted list into separate text boxes did not fix the problem: even with a single hyperlink per text box, the links were not exported below the object level).
about 9 months ago
As I have been converting more of these units, I have realized that it makes sense in #2 immediately above (forcing links to PDFs to open in new windows), when I am changing the target of the links to supporting documents, to change not just links to PDFs, but also to MP3s, AIFFs, MP4s, etc. My approach has been to instead do the following two steps (and I have also updated my tweak to use _blank, which is valid in the document object model for a web page, while _new is not, although it is almost always handled correctly anyway).
about 9 months ago
It really says something about how lousy PowerPoint is, both in general and as a “web development platform” that I’m still tweaking this dang post. Just fixed the regex for replacing links to Word documents with their PDF equivalents to take into account DOCX files, which have snuck into the most recent PowerPoints I’ve had to work with. Also, it seems like the update from two days ago should redirect not just links to presentations’ “Links & Sources” folders, but also links to other sites into a new window. Updated that regex too. Argh.
about 9 months ago
And now… I’ve just converted the whole shebang into an AppleScript (that depends on TextWrangler, since TextMate is less scriptable, although more usable — I should really just figure out how to write a TextMate bundle for this).
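For anyone without TextWrangler, a batch version of the same idea can be roughed out in Python. This is not the AppleScript mentioned above, just a sketch of a driver that runs a list of find-and-replace rules over every `.htm` file in an export folder. The names `RULES` and `fix_export_folder` are invented for illustration, and only two of the post's rules are listed to keep it short. It also assumes the exported files use a single-byte or UTF-8 encoding, which is why it round-trips the bytes through Latin-1 instead of guessing the real encoding.

```python
import re
from pathlib import Path

# (pattern, replacement) pairs, applied in order to each file.
# Only two of the post's rules are shown; add the others the same way.
RULES = [
    (r'(%)25([a-fA-F0-9]{2})', r'\1\2'),  # strip the second URL-encoding
    (r'%5[cC]', '/'),                     # encoded backslash -> forward slash
]


def fix_export_folder(folder, rules=RULES):
    """Apply every rule to every .htm file under folder; return the files changed."""
    changed = []
    for path in sorted(Path(folder).rglob("*.htm")):
        # Latin-1 maps each byte to one character and back again, so bytes the
        # rules do not touch are written out exactly as they were read.
        text = path.read_bytes().decode("latin-1")
        fixed = text
        for pattern, replacement in rules:
            fixed = re.sub(pattern, replacement, fixed)
        if fixed != text:
            path.write_bytes(fixed.encode("latin-1"))
            changed.append(path)
    return changed
```

Since it rewrites files in place, it is worth running on a copy of the exported folder first.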