Parsing well-formed HTML is difficult. Parsing malformed HTML
is harder. The code MUST NOT crash. It MUST repair the HTML page.
Often it must guess what the HTML author intended.
Staying compatible with every release of the HTML standard is complex.
Analyzing some attributes of some tags is a nightmare.
Take care with '"' and "'". A good test is to analyze: "\"" .
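
The quoting problem above can be illustrated with a small sketch. The helper name read_attr_value is hypothetical and is not part of this parser. It assumes HTML-style quoting, where a value opened with one kind of quote ends at the next quote of the same kind and a backslash is ordinary text. Under that assumption the test input "\"" yields the value \ and leaves one stray quote for the caller to repair.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of the library: copy the attribute value
 * that starts at `s` into a freshly malloc'ed string and store in *end the
 * position just past the value. Returns NULL if allocation fails. */
char *read_attr_value(const char *s, const char **end)
{
    const char *start;
    size_t len;
    char *out;

    if (*s == '"' || *s == '\'') {
        char quote = *s++;      /* remember which quote opened the value */
        start = s;
        while (*s != '\0' && *s != quote)
            s++;                /* the other quote kind is ordinary text */
        len = (size_t)(s - start);
        if (*s == quote)
            s++;                /* skip the closing quote; tolerate a missing one */
    } else {
        start = s;              /* unquoted value: stop at whitespace or '>' */
        while (*s != '\0' && *s != '>' && *s != ' ' && *s != '\t' && *s != '\n')
            s++;
        len = (size_t)(s - start);
    }

    out = malloc(len + 1);
    if (out == NULL)
        return NULL;
    memcpy(out, start, len);
    out[len] = '\0';
    if (end != NULL)
        *end = s;
    return out;
}
```

The caller owns the returned string and releases it with free().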
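typedef struct _HtmlTextInfo {
    char *title;            /* page title, from the HEAD section */
    char *base_url;         /* base URL, from the HEAD section */
    char *base_target;      /* base target, from the HEAD section */
    struct mark_up *mlist;  /* repaired markup list */
} HtmlTextInfo;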
Returns the repaired markup list in mlist, along with global information
about the page: title, base_url and base_target from the HEAD section.
In most cases you MUST call it, analyze the mlist, and pass it
to the HTML widget for display.
Every part of the structure is allocated and must be freed when no longer needed.
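
The ownership rule above might be handled along the following lines. This is only a sketch: the declarations are stand-ins so that it compiles on its own, the `next` link in struct mark_up is an assumption (only `start` appears in this document), and free_html_text_info_parts is a hypothetical name. If the library ships its own release routine, prefer that one.

```c
#include <stdlib.h>

/* Stand-in declarations so the sketch is self-contained.
 * ASSUMPTION: the markup list is singly linked through `next`. */
struct mark_up {
    char *start;
    struct mark_up *next;
};

typedef struct _HtmlTextInfo {
    char *title;
    char *base_url;
    char *base_target;
    struct mark_up *mlist;
} HtmlTextInfo;

/* Hypothetical helper: release every allocated part of `info` and reset
 * the pointers so that a second call is harmless. `info` itself is not
 * freed, because this document does not say how it is allocated. */
void free_html_text_info_parts(HtmlTextInfo *info)
{
    struct mark_up *m, *next;

    if (info == NULL)
        return;

    for (m = info->mlist; m != NULL; m = next) {
        next = m->next;         /* save the link before freeing the node */
        free(m->start);
        free(m);
    }
    info->mlist = NULL;

    free(info->title);          /* free(NULL) is a no-op */
    free(info->base_url);
    free(info->base_target);
    info->title = NULL;
    info->base_url = NULL;
    info->base_target = NULL;
}
```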
value = ParseMarkTag(mlist->start, "TABLE", "WIDTH");
if (value)
    width = atoi(value);

If the attribute is not specified, it returns NULL. If the attribute is specified but has no value, it returns "". Otherwise it returns the value of the requested attribute.
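
Since atoi cannot tell an empty or non-numeric value from a real 0, the three return cases can be folded into one conversion step. The sketch below is one possible way to do it; attr_to_int and its `fallback` parameter are hypothetical names, not part of the library, and strtol is swapped in for atoi so that bad input can be detected.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: convert the string returned for an attribute to an int.
 * NULL (attribute absent), "" (attribute present without a value) and text
 * that does not start with a number all yield `fallback`. */
int attr_to_int(const char *value, int fallback)
{
    char *end;
    long n;

    if (value == NULL || *value == '\0')
        return fallback;

    errno = 0;
    n = strtol(value, &end, 10);
    if (end == value || errno == ERANGE || n < INT_MIN || n > INT_MAX)
        return fallback;        /* no digits, or out of range for int */

    return (int)n;              /* trailing text such as "%" is ignored, as with atoi */
}
```

With this helper the call site becomes width = attr_to_int(value, default_width), where default_width is whatever the caller uses when no width is given. Whether the string returned by ParseMarkTag must be freed by the caller is not stated here, so check the library before discarding it.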