Do you have naked HTML out there?

A large majority (I hope) of university sites are using a content management system to control the thousands of pages which make up a single university web presence. I am willing to guess that multiple people are editing the content on those pages and they are using some sort of WYSIWYG editor.

CMSs come in a million different shapes and sizes, but their primary goal is to mash up a template with the content for each page. Even if you use a single template for every page (unlikely and ill-advised, in my opinion), it still has to be combined with the writer’s content.

Content editors can be your worst enemy (sometimes)

The difficulty comes when you cannot control the elements and formatting that a content editor puts in the CMS. A designer or developer has an ideal of how the content should look, but it rarely turns out that way.

I have seen plenty of web content editors who are not web writers by trade; they are usually secretaries, tech support staff, or just the most tech-savvy person in the department. Often they do not edit web pages every day, so when they do, they may forget to hit the “Paste as Plain Text” or “Paste from Word” button, and now their site is filled with a bunch of HTML junk.

Are you planning for all this junk?

I recommend a two-pronged approach. The first prong is using HTML Tidy to clean up the HTML on the server end before it even gets to the browser. This will produce clean code, but what the user sees may no longer match what they thought they put in. It is for their benefit, though, so make sure it’s brought up during CMS training.
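To give a feel for what server-side cleanup does to pasted content, here is a minimal sketch in Python. It is not HTML Tidy and does not use its API; it is a stand-in built on the standard library's `html.parser` that keeps only whitelisted tags and attributes. The whitelist, the `PasteCleaner` class and the `clean_pasted_html` function are all hypothetical names made up for this example, and it is not meant as a security filter.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical whitelist: list only the tags your templates actually style.
ALLOWED_TAGS = {"p", "a", "ul", "ol", "li", "strong", "em",
                "h2", "h3", "dl", "dt", "dd", "address", "br"}
ALLOWED_ATTRS = {"a": {"href"}}      # attributes kept, per tag
VOID_TAGS = {"br"}                   # tags that never get a closing tag
SKIP_CONTENT = {"style", "script"}   # tags whose text is dropped entirely


class PasteCleaner(HTMLParser):
    """Rebuilds the markup, keeping only whitelisted tags and attributes."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip = 0  # greater than zero while inside <style> or <script>

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_CONTENT:
            self._skip += 1
            return
        if tag not in ALLOWED_TAGS:
            return  # drop the tag itself but keep the text inside it
        kept = "".join(
            f' {name}="{escape(value or "", quote=True)}"'
            for name, value in attrs
            if name in ALLOWED_ATTRS.get(tag, set())
        )
        self.parts.append(f"<{tag}{kept}>")

    def handle_endtag(self, tag):
        if tag in SKIP_CONTENT:
            if self._skip:
                self._skip -= 1
            return
        if tag in ALLOWED_TAGS and tag not in VOID_TAGS:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(escape(data, quote=False))


def clean_pasted_html(markup: str) -> str:
    """Return the markup with non-whitelisted tags and attributes removed."""
    cleaner = PasteCleaner()
    cleaner.feed(markup)
    cleaner.close()
    return "".join(cleaner.parts)


print(clean_pasted_html(
    '<p class="MsoNormal">Hi <span style="color:red">there</span></p>'
))
```

Comments (including Word's conditional comments) are dropped because the parser's default comment handler does nothing. A real deployment would more likely lean on a maintained tool such as HTML Tidy than on a hand-rolled filter like this one.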

The second is to make sure you style ALL HTML elements that could possibly be entered. No designer or developer wants to get caught with some naked <dl> or <address> tags.

I have been using the XHTML, CSS Style Guide by Ross Johnson, which lets you import a CSS style sheet and see how it reacts to every HTML element out there (I am pretty sure he covers all of them). It can work one of two ways: either view the style on his site, or copy and paste the HTML into a new page on your site and style away!
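If you would rather roll your own test page, the idea can be sketched in a few lines of Python. This is not Ross Johnson's guide, just a rough illustration of the same approach: the snippet list is a small hypothetical sample (nowhere near every element), and `build_test_page` and the `/css/site.css` path are names invented for the example.

```python
from html import escape

# A hypothetical starter set of elements; extend it with anything
# your WYSIWYG editor is able to produce.
SNIPPETS = [
    "<h1>Heading 1</h1>",
    "<h2>Heading 2</h2>",
    '<p>Paragraph with <a href="#">a link</a>, <strong>strong</strong>, '
    "<em>emphasis</em> and <code>code</code>.</p>",
    "<blockquote><p>Block quotation</p></blockquote>",
    "<ul><li>Unordered item</li></ul>",
    "<ol><li>Ordered item</li></ol>",
    "<dl><dt>Term</dt><dd>Definition</dd></dl>",
    "<address>123 Example Hall</address>",
    "<pre>Preformatted text</pre>",
    "<table><caption>Caption</caption>"
    "<tr><th>Header</th></tr><tr><td>Cell</td></tr></table>",
    "<hr>",
]


def build_test_page(stylesheet_href: str) -> str:
    """Return an HTML page that shows every snippet under the given style sheet."""
    body = "\n".join(SNIPPETS)
    return (
        "<!DOCTYPE html>\n"
        '<html lang="en">\n<head>\n<meta charset="utf-8">\n'
        "<title>Element style test</title>\n"
        f'<link rel="stylesheet" href="{escape(stylesheet_href, quote=True)}">\n'
        "</head>\n<body>\n"
        f"{body}\n"
        "</body>\n</html>\n"
    )


# Write the page somewhere your site can serve it, then open it in a browser.
with open("element-test.html", "w", encoding="utf-8") as handle:
    handle.write(build_test_page("/css/site.css"))
```

Any element that shows up on that page looking like browser defaults is one you have not styled yet.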

Planning ahead

Planning ahead for these types of things can save you embarrassment down the road when you have an under- or overactive web writer on your hands.

Last but not least, I cannot end without mentioning CSS resets, which are either loved or despised by developers out there. If worst comes to worst, you can use Eric Meyer’s CSS Reset or the Yahoo YUI Reset to obtain a completely flat, style-less content area to build upon. Although it will add some weight to your CSS files, it will give you the finest control over all your elements.

What’s your strategy?

Are you a designer making sure you account for all this content? A developer endlessly styling tags, hoping you don’t forget any? Or a writer trying to clean up your Word-pasted HTML?

8 Responses to “Do you have naked HTML out there?”

  1. Says:

    The college I work at uses a PHP-based, home-built CMS. We also use TinyMCE for placing content in pages. It is great for posting content from Word into HTML. However, the results aren’t always perfect, whether because of TinyMCE’s limitations or the nature of Word’s formatting. When that happens, I use “It’s All Text!” (a Firefox extension) to open the content from TinyMCE’s “HTML view” in vim, and clean up the code with regular expressions (or manually).

  2. Says:

    This is also useful for preparing your site for all the different flavors of mobile browsers out there; their HTML interpreters can be rather strict. Great info.

  3. Says:

    We use HTML Purifier to allow only certain whitelisted tags (and attributes) through.

  4. Says:

    @rdundon But how do you deal with the hundreds of pages out there that you don’t oversee? Or do you actually clean up every page by hand?

    @Will For sure: the cleaner and lighter, the better.

    @Web Manager That is awesome! What WYSIWYG editor do you use?

  5. Says:

    When I was Webmaster at Case Western Reserve Univ., we were not using a campus-wide CMS. We had close to 400,000 (public) pages on more than 500 servers. Some used a CMS and some did not, but the primary server did not. We were exploring CMS options when I left the university, but one of the challenges was finding something that would produce standards-compliant code and be adaptable for users of varying skill levels.

    Since we didn’t have a CMS, I came to use our campus blog system (running Movable Type) in many situations that would otherwise merit a CMS. We used it for everything from our press releases to department blogs and podcasts. While users could use the WYSIWYG editor, I trained the press release writers and encouraged others to post in HTML. It helped keep things clean and helped us with search engine optimization.

    Since then I’ve gone out on my own as a designer and have been using WordPress as a CMS when necessary and it works well. But I’ve had to install a plug-in to make sure that it renders my HTML entries properly without adding line breaks and other quirky bits. I suspect this is similar to the problem rdundon has had with pasting content into a CMS.

    Content Management Systems come in all shapes and sizes, but the trick is getting them to render exactly what the templates specify. It seems there is always some tweaking to be done. Your point about resets is a good one. If we have multiple users inputting content in a variety of formats, this is probably a very good scenario in which to use resets, so that unless it is overridden by user input, our CSS has at least some chance of ruling the day.

  6. Says:

    Talking about naked HTML, Nick: I once asked a web designer to design and code a website for me. I wasn’t aware that naked HTML could make such a big mess of my website! I learned from the mistake, and now I know that naked HTML can be a serious problem if it isn’t handled properly.

  7. Says:

    I’m obviously a big fan of Drupal. It’s capable of running any kind of website, and it has tons of modules that will protect against these kinds of problems. Plus, it’s completely free.

  8. Says:

    Your blog looks good. Have a nice day.. Come on the internet site and yourself check it!