Rick Strahl's Weblog  


Rethinking my HTML Editing - switching to plain Web Browser Control from DHTMLEdit


West Wind HTML Help Builder was released last week, and so far it looks like the release was reasonably solid. I've only heard of a few odd install problems, which were fixed immediately with a new upload.

Since last week I've been mucking around trying to iron out some additional issues in the HTML editing. The HTML editor is a mixed bag - it's vastly improved over the previous functionality, with interactive menus and all of the menu choices from text mode also working in code mode. More importantly, code pasting and limited table editing are now available in visual mode.

Personally I like to work in plain text/markup mode in Help Builder, which uses plain text for most everything plus special HTML-like tags (<< >> instead of single < >). One thing I have been forcing myself to do is write these blog entries with the editor, which is about the most writing I do on a regular basis <g>.

Anyway, one of the problems that the HTML editing in Help Builder has currently is that the Microsoft DHTML Edit control has some serious focus problems. It started with a bug report from Dan Tollefson who mentioned that he saw some really odd focus problems with Tablet PC when using tablet handwriting input. He reported that once the input was made focus would not properly return to the Edit control unless he clicked on another form or control first. I wasn't able to duplicate this right away but noticed the focus problems in other places when Help Builder itself pops up dialogs for link and crosslink entries, image embedding and screen captures etc.

Worse though, I noticed that the control was not properly firing LostFocus and GotFocus events. I need to trap these in order to enable and disable the editing toolbar. The control would fire LostFocus only the first time after the control was loaded, but if you come back into the control and click on another control a second time, LostFocus no longer fires. Aaaaargh!

I tried for hours to find a workaround, including messing with the IE document events, but the problem is that the control host is not passing the state change events between the container and the IE document, and there's nothing I could do to fix this.

Gave it some thought and tried the plain Web Browser control with DesignMode="on", and lo and behold the problem does not exist there. In fact, the plain Web Browser control feels much smoother rendering and handling the document in general - typing is a bit snappier, and the various odd focus problems I was previously having have all but disappeared.

Help Builder manipulates the editor almost entirely through the MSHTML DOM so making the switch to the Web Browser Control from the DHTML Editing control wasn't too much of an inconvenience. However there are a couple of nasty behavior changes.

Document Loading
It took me a while to properly get the document into designMode and ready for editing. It turns out that setting:

oEditor.Document.DesignMode="on"

actually reloads the document, and you need to wait again for the document to finish loading. The code for this requires two wait loops:

*** Write it out to disk so we can navigate to it - empty document
File2Var(lcUrl,lcHTML)
loEdit.Navigate(lcUrl)

*** Wait for document to load
lnSeconds = SECONDS()
DO WHILE TYPE("loEdit.Document.Body.innerHtml") # "C" AND ;
         SECONDS() - lnSeconds < 3
   DOEVENTS
ENDDO

loEdit.Document.Designmode = "on"

*** Document reloads for designmode
lnSeconds = SECONDS()
DO WHILE TYPE("loEdit.Document.Body.innerHtml") # "C" AND ;
         SECONDS() - lnSeconds < 3
   DOEVENTS
ENDDO
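For readers who don't read FoxPro, the same two-phase wait can be sketched in Python. This is only an illustration of the pattern under stated assumptions: `FakeDocument`, `wait_until` and `load_for_editing` are made-up names, and the fake document merely simulates a body that shows up after a few polls and disappears again when design mode is switched on.

```python
import time


def wait_until(is_ready, timeout=3.0, poll=0.01):
    """Poll is_ready() until it returns True or `timeout` seconds have passed."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if is_ready():
            return True
        time.sleep(poll)
    return is_ready()  # one last check at the deadline


class FakeDocument:
    """Hypothetical stand-in for the browser document (not a real API)."""

    def __init__(self, polls_until_loaded=3):
        self._remaining = polls_until_loaded
        self.design_mode = "off"

    def body_ready(self):
        # The body only becomes available after a few polls.
        if self._remaining > 0:
            self._remaining -= 1
            return False
        return True

    def set_design_mode(self, value, polls_until_loaded=3):
        # Switching design mode reloads the document, so the body goes away again.
        self.design_mode = value
        self._remaining = polls_until_loaded


def load_for_editing(doc, timeout=3.0):
    """Wait for the initial load, switch to design mode, then wait for the reload."""
    if not wait_until(doc.body_ready, timeout):
        return False
    doc.set_design_mode("on")
    return wait_until(doc.body_ready, timeout)
```

The point of the sketch is simply that the second wait is not optional: without it, code that touches the body right after switching modes can hit a document that is still reloading.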

Took a while to understand why loEdit.Document.Body.innerHtml was valid and then all of a sudden disappeared again <g>...

BaseUrl
The other problem is the base URL of the document. The DHTML editor has a baseURL that fixes up URLs. I'm not quite sure how it does it but it apparently captures the base url and then fixes up the document whenever you retrieve any data from it.

The plain WebBrowser control doesn't do this, and IE in its ever helpful manner forces all URLs into fully qualified URLs. So:

<a href='html/mypage.html'>

would turn into:

<a href='file:///d:/wwapps/help/blogentries/html/mypage.html'>

Now when you try to edit this link, loLink.Href returns that full path, which is not really what you want. Instead I want the original short path. In addition, the document's innerHTML also gets returned with these baseURL-adorned links.

I played around with providing a Base URL in the document itself, but this still didn't stop IE from returning the URLs as fully qualified. In the end my solution was the brute force way of fixing up the URL with this function:

************************************************************************
FUNCTION FixBasePath(lcElementPath, loCtl)
****************************
***  Function: Fixes the base path of a URL by stripping the
***            base path from a URL making it effectively relative
***            path. Used in the DHTML Editor.
***    Return: HTML string
*************************************************************************
LOCAL lnAt, lcPathOnly

*** Strip to last /
lnAt = RAT("/",loCtl.Document.location.toString())
lcPathOnly = SUBSTR(loCtl.Document.location.toString(),1,lnAt )

*** Case insensitive replace of all occurrences
RETURN STRTRAN(lcElementPath,lcPathOnly,"",1,-1,1)
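The string handling in FixBasePath is easy to try outside of Help Builder. Here's a minimal Python sketch of the same idea, assuming the document URL is passed in as a plain string; `fix_base_path` and the `_preview.htm` file name are made up for the example.

```python
import re


def fix_base_path(html, document_url):
    """Strip the document's folder URL from html so links become relative again."""
    # Everything up to and including the last "/" is treated as the base path.
    base = document_url[: document_url.rfind("/") + 1]
    if not base:
        return html  # no "/" in the URL, so there is nothing to strip
    # Case-insensitive removal of every occurrence, like STRTRAN(..., 1, -1, 1).
    return re.sub(re.escape(base), "", html, flags=re.IGNORECASE)


page_url = "file:///d:/wwapps/help/blogentries/_preview.htm"  # hypothetical file name
link = "<a href='file:///d:/wwapps/help/blogentries/html/mypage.html'>"
print(fix_base_path(link, page_url))  # <a href='html/mypage.html'>
```

Like the FoxPro version, this is brute force: any text in the document that happens to contain the base path gets stripped too, not just href and src values.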

I then use this function to clean up the HTML that gets stored with the topic:

lcValue = this.parent.oHtmlEdit.Document.Body.innerHtml
...
IF lcValue # "<RAWHTML"
   lcValue = [<RAWHTML Editor="Html">] + CHR(13)+CHR(10) + lcValue + CHR(13) + CHR(10) + "</RAWHTML>"
ENDIF

this.Value = FixBasePath(lcValue,this.Parent.oHtmlEdit.Document.Body)

Along the same lines, every time an image or hyperlink is edited we need to translate the value to a relative path first. Here's the hyperlink editing code from the context menu:

*** Edit Hyperlink
pcURL = loCtl.Href
IF pcURL = "file://"
   pcUrl = FixBasePath(pcUrl,loCtl)
ENDIF
pcText = loCtl.innerText

*** Hyperlink Dialog
plNewWindow = !EMPTY(loctl.Target)
plCancel = .F.
loForm = CREATE("wwHRefDialog",pcURL,pcText)
loForm.lNewWindow = !EMPTY(loctl.Target)
loForm.SHOW()

IF !plCancel
   loCtl.Href=pcUrl
   loCtl.innerText = pcText
   IF plNewWindow
      loCtl.Target="_top"
   ELSE
      loCtl.Target = ""
   ENDIF
ENDIF
*** End HyperLink Dialog

Overall not too bad, but this fixup also has a few side effects (although I think they also existed with the DHTML control). Help Builder can View Source and tries to find the location in the HTML source that matches the current element. It does this basically by looking at the active element, extracting its outer HTML and then trying to find it in the source code string, setting SelStart and SelLength to select the text:

*** Display the textbox and read data from Design Mode
THIS.SaveHTMLToText()

*** Try to preselect the text in the edit control
LOCAL loRange, lnAt, lcSelectedHtml
loEdit = THIS.parent.oHTMLedit
loRange = loEdit.Document.Selection.CreateRange()
loParent = null

IF VARTYPE(loRange) = "O"
   TRY
      loParent = loRange.parentElement
   CATCH
   FINALLY
      IF !ISNULL(loParent)
         lcSelectedHtml = FixBasePath(loParent.outerHtml, loEdit.Document.Body)
         lnAt =ATC(lcSelectedHtml,this.Value)
         IF lnAt > 0
            this.SelStart = lnAt
            this.SelLength = LEN(lcSelectedHtml)
            this.SetFocus()
         ENDIF
      ENDIF
   ENDTRY
ENDIF

this.SetFocus()
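The lookup itself boils down to a case-insensitive substring search that yields a start position and a length. A rough Python sketch, assuming both the stored source and the element's (already fixed up) outer HTML are plain strings; `find_selection` is a made-up name, and the start position it returns is zero-based.

```python
import re


def find_selection(source, element_html):
    """Return (start, length) of element_html inside source, ignoring case.

    Returns None when the element's HTML is empty or cannot be found.
    """
    if not element_html:
        return None
    match = re.search(re.escape(element_html), source, flags=re.IGNORECASE)
    if match is None:
        return None
    return match.start(), len(match.group(0))


source = "<p>Intro</p><A href='html/mypage.html'>link</A>"
element = "<a href='html/mypage.html'>link</a>"
print(find_selection(source, element))
```

The search only works if the element's HTML has been through the same fixup as the stored source. If the same markup occurs more than once, the first occurrence wins, which is probably one of the cases where the selection ends up in the wrong place.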

Works in almost all situations. Note the call to FixBasePath to ensure that the selection we're making matches what we're saving, so this works even with fixed up link tags. This is pretty cool.

Now it'd be real nice if I could figure out a way to select text in the HTML document from where the cursor sits in the source code! But I'll leave that exercise for another day.

In the meantime I'm eating my own dogfood here. The editor works pretty well, but this control definitely has its quirks.

Some things that I wish I could fix but don't know how to:

Importing RTF text directly into HTML
This relates specifically to pasting source code from VS.NET or Visual FoxPro into the edit control. Like Outlook, Outlook Express etc. this doesn't work real well - the coloring comes in properly but the control loses the formatting. The current approach I use is to prompt for a language to select and import from that.

Better CRLF handling
The way the control handles line spacing is really very non-intuitive. Everything ends up being double spaced, and going back oftentimes wipes out the previous page.

HTML Source formatting
It'd be really nice if I could figure out some way to format the HTML nicely instead of whatever mess IE generates from typed text. Actually Help Builder does some minimal fix up, injecting a few linebreaks between table tags, divs etc. But it'd be real nice if the document could be pretty formatted. OTOH, this could be problematic too, especially if you have formatted text like the source code above - how in the world would you know to leave that alone? Anything between PRE tags stays as is. I've done enough manual HTML parsing in Help Builder to last me a lifetime (there's more of it in the topic renderer).
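To make the "leave PRE alone" idea concrete, here's a small Python sketch of that kind of minimal fix up. It is not Help Builder's actual code: the function name and the list of tags are assumptions, and it does nothing more than add a line break after a handful of closing block tags while passing anything inside PRE through untouched.

```python
import re

# Split pattern: the capture group keeps each <pre>...</pre> block as its own piece.
_PRE_BLOCK = re.compile(r"(<pre\b.*?</pre\s*>)", re.IGNORECASE | re.DOTALL)
# Closing tags that get a line break after them (an arbitrary list for the sketch).
_CLOSING_BLOCK_TAG = re.compile(r"(</(?:table|tr|td|div|p)\s*>)", re.IGNORECASE)


def add_line_breaks(html):
    """Insert a newline after closing block tags, except inside <pre> blocks."""
    pieces = _PRE_BLOCK.split(html)
    result = []
    for index, piece in enumerate(pieces):
        if index % 2 == 1:
            # Odd pieces are the captured <pre> blocks: keep them exactly as they are.
            result.append(piece)
        else:
            result.append(_CLOSING_BLOCK_TAG.sub(r"\1" + "\n", piece))
    return "".join(result)


print(add_line_breaks("<div>a</div><pre><div>x</div></pre><p>b</p>"))
```

This is still regex parsing, so it inherits all the usual problems: nested or unclosed PRE tags will confuse it, and running it twice doubles up the line breaks.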

 

When I get a little more time I'll post some more detail on how Help Builder HTML Editing works along with some of the event handling code. This won't be fully generic


The Voices of Reason


 

# RE: Rethinking my HTML Editing - switching to plain Web Browser Control from DHTMLEdit

Look into the SgmlReader class (on GotDotNet). Take the inner HTML from your editor and run it through the SgmlReader into an Xml document. Then, you have well-formed XML you can work with and do whatever you want to it.

I created a stylesheet once that converted it from XHTML-Transitional (what you get when you use SgmlReader) to XHTML-Basic (for mobile devices). That stripped the garbage away and left it nice and clean.

Mike McCann
February 14, 2005

# re: Rethinking my HTML Editing - switching to plain Web Browser Control from DHTMLEdit

Any chance of "stealing" the html editor out of the .NET IDE or the ASP.NET stuff? We have been using MS InterDev 6.0 for our Help Builder files because it is a pretty nice editor and gives us source control. Someday I hope to see both in WW-HHB :)

Thanks

Rick Strahl
February 14, 2005

# re: Rethinking my HTML Editing - switching to plain Web Browser Control from DHTMLEdit

Mike, the editor is basically the IE editor on steroids. I've tried to see if there was some way to embed those (also the FrontPage editor), but it's not really possible.

As far as it goes the current VS visual editor is pretty crappy and only a slight step above what the current Help Builder editor does. FrontPage on the other hand is lightyears above that. I have a whole bunch of cool stuff I wanted to do like document map drill downs etc. but it's all so flakey with the control.

I really wish Microsoft would give us better support for HTML editing. This is fairly important in today's applications since so much content is stored in HTML based markup. I probably will spend more time with the editor code I have now to make it more usable, but it's a huge time sink and not really something I'm great at <g>...

Steve Trefethen
February 18, 2005

# re: Rethinking my HTML Editing - switching to plain Web Browser Control from DHTMLEdit

Hi Rick, I just thought I'd add that in order to get URLs to load correctly you need to use IPersistStreamInit and IMoniker. It's a pretty long story, but here is a URL to a newsgroup thread where the problem and its solution are discussed (watch for wrapping etc.): http://groups-beta.google.com/group/microsoft.public.inetsdk.programming.webbrowser_ctl/browse_frm/thread/93096f2a16980d58/126ef49391662171?q=trefethen+ipersiststreaminit&_done=%2Fgroups%3Fq%3Dtrefethen+ipersiststreaminit+%26hl%3Den%26lr%3D%26client%3Dfirefox-a%26rls%3Dorg.mozilla:en-US:official%26sa%3DN%26tab%3Dwg%26&_doneTitle=Back+to+Search&&d#126ef49391662171

Hope this helps in some way!

Nick's Delphi Blog
April 20, 2005

# re: Crappy HTML


Tristan Leask
September 13, 2005

# re: Rethinking my HTML Editing - switching to plain Web Browser Control from DHTMLEdit

Hi Rick,

Did you ever figure out a way to get round the double spaced lines when hitting the return key? Cheers mate!

Rick Strahl
September 13, 2005

# re: Rethinking my HTML Editing - switching to plain Web Browser Control from DHTMLEdit

Tristan,

When you press enter IE inserts a <p> tag. It can't be changed. What you could do (maybe) is trap keystrokes, capture Enter, then send Shift Enter instead (which inserts a <br>). That's as close as you're going to get I think...

Steve Trefethen
September 26, 2005

# re: Rethinking my HTML Editing - switching to plain Web Browser Control from DHTMLEdit

Hi Rick,
Actually, you can change what gets inserted when you press the Enter key. For more details look for DOCHOSTUIFLAG_DIV_BLOCKDEFAULT. Basically, you can change it to use <div> tags rather than a <p> tag. This constant is used in the IDocHostUIHandler::GetHostInfo call.

-Steve

Tristan Leask
October 05, 2005

# re: Rethinking my HTML Editing - switching to plain Web Browser Control from DHTMLEdit

Yep, got this solved now...

On our HTML control, whenever a value is loaded into it, I stick a <div> at the beginning and then any Enter keystrokes come out as just the single line kind.

Masterful

West Wind  © Rick Strahl, West Wind Technologies, 2005 - 2024