[[!meta title="LibHubbub"]]
[[!meta author="Parwana"]]
[[!meta date="2014-07-21T14:30:43Z"]]


[[!toc]] The following are the areas
where Libhubbub still falls short, although it is now reliable in most cases.

1\) The element stack grows on repeated pushes but never shrinks on
pop. The proposed solution is to halve its capacity once the used
proportion falls below 1/3, and likewise to double its capacity once it
is full. Approval from the core developers is necessary before trying to
implement this.
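
A minimal sketch of the proposed resizing policy follows. It assumes a hypothetical `elem_stack` type holding plain integers and a hypothetical minimum capacity `STACK_MIN_CAP`; none of these names come from Hubbub's treebuilder, whose stack stores element contexts instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define STACK_MIN_CAP 4 /* hypothetical floor, so small stacks never thrash */

/* Hypothetical stand-in for the treebuilder's element stack. */
typedef struct {
	int *items;  /* Hubbub would keep element contexts here */
	size_t used; /* number of slots in use */
	size_t cap;  /* number of slots allocated */
} elem_stack;

static bool stack_resize(elem_stack *s, size_t new_cap)
{
	int *p = realloc(s->items, new_cap * sizeof(*p));
	if (p == NULL)
		return false; /* the old buffer is still valid */
	s->items = p;
	s->cap = new_cap;
	return true;
}

bool stack_push(elem_stack *s, int v)
{
	assert(s != NULL);
	/* Grow to twice the capacity once the stack is full. */
	if (s->used == s->cap) {
		size_t new_cap = s->cap ? s->cap * 2 : STACK_MIN_CAP;
		if (!stack_resize(s, new_cap))
			return false;
	}
	s->items[s->used++] = v;
	return true;
}

bool stack_pop(elem_stack *s, int *out)
{
	assert(s != NULL && out != NULL);
	if (s->used == 0)
		return false;
	*out = s->items[--s->used];
	/* Halve the capacity once fewer than a third of the slots are used. */
	if (s->cap > STACK_MIN_CAP && s->used * 3 < s->cap) {
		size_t new_cap = s->cap / 2;
		if (new_cap < STACK_MIN_CAP)
			new_cap = STACK_MIN_CAP;
		(void) stack_resize(s, new_cap); /* failing to shrink is harmless */
	}
	return true;
}
```

Because growth triggers at full and shrinkage at one third, a push immediately after a shrink still leaves free room, so alternating pushes and pops near a boundary should not cause repeated reallocation.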

2\) The tokeniser has become very messy and unreadable because of the
introduction of script-related states. The proposed solution is to have
standalone handlers for each state, but this may mean a significant
increase in code size and redundant code, at the expense of code
reuse.

3\) The library has slowed down significantly because it is now
required to store tag attributes. I currently store them in the
context details, but the repeated use of strndup to copy attribute strings
during stack pushes, as well as during formatting list pushes, has severely
slowed things down. However, it works reliably.

4\) Assumption: The client currently doesn't support creation of template
elements. Libhubbub can now properly handle template tags, treating
template as equivalent to any other tag. When template creation
support is provided, the only remaining work would be to incorporate
it into the insert\_element method of the treebuilder.

5\) Handling script tags in SVG mode requires the client to support it
too. The spec is a bit hazy here, and any input on it would be appreciated;
look under the script end tag at
<http://www.whatwg.org/specs/web-apps/current-work/multipage/tree-construction.html#parsing-main-inforeign>.

6\) I couldn't work out how to find out whether the document is an
iframe-source document. Any input on this would be
helpful. :)

7\) The charset detection mechanism previously prescanned up to 512
bytes of the document to find the meta tag. This has been increased to
1024 bytes, which requires the approval of the core developers. Also, no
algorithms are currently implemented to auto-detect the document encoding. If
appropriate sources are provided, I will try implementing those in
Hubbub.
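
To illustrate only the byte limit, here is a rough sketch of a bounded search for a meta tag. The names `PRESCAN_LIMIT` and `find_meta_in_prescan` are hypothetical and not part of Hubbub, and a real prescan does considerably more (it skips comments and parses the tag's attributes, for example).

```c
#include <ctype.h>
#include <stddef.h>

#define PRESCAN_LIMIT 1024 /* the window size discussed above; previously 512 */

/* Hypothetical helper: return the offset of the first "<meta" (in any
 * letter case) that lies entirely within the prescan window, or -1 if
 * there is none. */
long find_meta_in_prescan(const unsigned char *data, size_t len)
{
	static const char needle[] = "<meta";
	const size_t nlen = sizeof(needle) - 1;
	const size_t limit = len < PRESCAN_LIMIT ? len : PRESCAN_LIMIT;

	for (size_t i = 0; i + nlen <= limit; i++) {
		size_t j = 0;
		/* Compare case-insensitively against the lowercase needle. */
		while (j < nlen && tolower(data[i + j]) == needle[j])
			j++;
		if (j == nlen)
			return (long) i;
	}
	return -1;
}
```

In this sketch a tag that starts inside the window but runs past its end is treated as not found, which is one possible reading of the limit rather than a statement of what Hubbub does.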

8\) XML violations are a special set of rules that make the API safe
for the XML pipeline, and Hubbub currently doesn't support them. If the
core developers see this as necessary at all, I will try
implementing it. Ref:
<http://www.whatwg.org/specs/web-apps/current-work/multipage/syntax.html#coercing-an-html-dom-into-an-infoset>

9\) Some errors that I am unaware of may have crept into the library.
After all, to err is human :-P

Rupinder Singh Khokhar