path: root/src/tokeniser
Commit message  Author  Age  Files  Lines
* Fixed enumerator entries & name-type mapping. Also fixed the option/optgroup ↵  Rupinder Singh Khokhar  2014-08-01  1  -0/+2
| | | | tag starting handler. Also emitted on encountering a '<' in script related state. Also fixed the check for special element.
* SCRIPT related states added to the tokeniser. This might still be buggy. All ↵  Rupinder Singh Khokhar  2014-08-01  1  -16/+393
| | | | the script related bug patches will be rebased from here. Some states can still be collapsed & the code can still be made more understandable & beautiful ;)
* Handling LF after CR in bogus comment state & changing few tests to make it ↵  Rupinder Singh Khokhar  2014-07-09  1  -2/+3
| | | | in accordance with the tester interface
* added RAWTEXT contentModel. Also removed an if(c='-') condition because I ↵  Rupinder Singh Khokhar  2014-07-09  1  -95/+93
| | | | felt it was extraneous, with no clear logic, not according to the specs. Also fixed a severe bug in handling the tagname state. In all 3 more test files give a PASS
* Adding PLAINTEXT State & fixing the tester at places  Rupinder Singh Khokhar  2014-07-09  1  -3/+7
|
* There could have been a better way to handle EOFs between tag-names and ↵  Rupinder Singh Khokhar  2014-07-09  1  -22/+44
| | | | attribute values. [Fix] Numeric overflow check algo. [fix] cp1252 tables.
* Adding the COMMENT_END_BANG state for test3.dat  Rupinder Singh Khokhar  2014-07-09  1  -1/+24
|
* [Fix] tokeniser wrongly emitted a replacement character instead of utf8 ↵  Rupinder Singh Khokhar  2014-07-09  1  -2/+8
| | | | NULL. Also, the tester used strlen to calculate string lengths--this seg faults if a null is passed-- this is also fixed.
* Updating Named Entities API in tokeniser  Rupinder Singh Khokhar  2014-07-09  3  -50/+70
|
* Remove client allocation function and update for new lpu API.  Michael Drake  2013-12-14  2  -25/+12
|
* Fix uninitialised pause variable  Vincent Sanders  2012-07-13  1  -0/+2
|
* Add ability to pause tokenisation  Vincent Sanders  2012-07-10  2  -2/+27
|
* Insert data at correct point in input stream.  John-Mark Bell  2012-07-05  2  -0/+49
|
* Update to new NSBUILD infrastructure  Daniel Silverstone  2012-06-29  1  -1/+1
| | | | svn path=/trunk/hubbub/; revision=14006
* Fix build with GCC 4.6  John Mark Bell  2011-07-26  1  -6/+11
| | | | svn path=/trunk/hubbub/; revision=12628
* Fix profile and coverage targets  John Mark Bell  2010-12-06  1  -1/+3
| | | | svn path=/trunk/hubbub/; revision=11021
* Remove init/final and embed entity trie at build time. r=vince  Daniel Silverstone  2010-12-04  4  -2205/+76
| | | | svn path=/trunk/hubbub/; revision=10976
* Make assignment of doctype component pointers clearer. Also removes a ↵  John Mark Bell  2009-05-27  1  -9/+6
| | | | | | redundant pointer increment. svn path=/trunk/hubbub/; revision=7581
* Remove redundant code.  John Mark Bell  2009-05-27  1  -13/+0
| | | | svn path=/trunk/hubbub/; revision=7580
* Initialise variables to stop GCC 4.4 complaining (credit: Jeroen Habraken)  John Mark Bell  2009-05-05  1  -4/+4
| | | | svn path=/trunk/hubbub/; revision=7398
* Improve error handling in the tokeniser  John Mark Bell  2009-04-06  1  -62/+154
| | | | svn path=/trunk/hubbub/; revision=7052
* hubbub_alloc -> hubbub_allocator_fn  John Mark Bell  2009-04-04  4  -7/+9
| | | | svn path=/trunk/hubbub/; revision=7043
* First cut at porting hubbub's buildsystem to the core tools  John Mark Bell  2009-03-24  1  -44/+2
| | | | svn path=/trunk/hubbub/; revision=6837
* Sync tokeniser tests with html5lib.  John Mark Bell  2009-03-10  1  -12/+19
| | | | | | | | Sync tokeniser implementation with the spec. Fix handling of \0 in the tag open state. The unicodeCharacters test is disabled, as json-c doesn't like it. svn path=/trunk/hubbub/; revision=6755
* Make doxygen produce API documentation. I guess it helps if you enable the ↵  John Mark Bell  2009-01-08  1  -1/+1
| | | | | | | | right options. Fix a couple more doxygen warnings. svn path=/trunk/hubbub/; revision=5996
* Use doxygen to create API documentation.  John Mark Bell  2009-01-08  3  -18/+22
| | | | | | Add a bunch of extra commentary to stop doxygen warning. svn path=/trunk/hubbub/; revision=5994
* Fix potential read beyond available input data when processing \r in some ↵  John Mark Bell  2009-01-06  1  -5/+5
| | | | | | | | | | states. What happened was that, given \rabc, we would advance past the \r, then read at current_offset + len (len == 1). I.E. read 'b' instead of 'a'. If the data in the inputstream's internal buffer happened to end immediately after the \r, then we'd read past the end of the buffer thanks to a bug in lpu_inputstream_peek which was fixed in r5965. In any case, we'd still be looking at the wrong character when looking for CRLF pairs. All regression tests now pass again. svn path=/trunk/hubbub/; revision=5967
* Port to changed lpu API.  John Mark Bell  2009-01-06  1  -455/+635
| | | | | | | Drop HUBBUB_OOD and just use HUBBUB_NEEDDATA, instead. Currently aborts in bogus comment handling if it encounters a \r at the end of the inputstream's utf-8 buffer. svn path=/trunk/hubbub/; revision=5966
* Fix build breakage  John Mark Bell  2008-11-30  1  -1/+3
| | | | svn path=/trunk/hubbub/; revision=5851
* lotsa C89, please check.  François Revel  2008-11-30  1  -48/+91
| | | | svn path=/trunk/hubbub/; revision=5846
* Return errors from tokeniser constructor/destructor  John Mark Bell  2008-11-09  2  -16/+25
| | | | svn path=/trunk/hubbub/; revision=5664
* Return errors from dictionary constructor/destructor.  John Mark Bell  2008-11-09  1  -3/+3
| | | | | | Fix commentary copied from libcss svn path=/trunk/hubbub/; revision=5663
* Port hubbub to new lpu API  John Mark Bell  2008-11-08  1  -2/+3
| | | | svn path=/trunk/hubbub/; revision=5656
* Squash memory leak  John Mark Bell  2008-09-08  1  -0/+2
| | | | svn path=/trunk/hubbub/; revision=5285
* Fixes for handling of CR followed immediately by multibyte sequences.  John Mark Bell  2008-09-06  1  -59/+94
| | | | | | | Pedantic whitespace changes. More paranoia surrounding entity handling. svn path=/trunk/hubbub/; revision=5266
* Fix segfault caused by trampling the length of the current character when ↵  John Mark Bell  2008-08-18  1  -2/+8
| | | | | | | | testing whether the 4 most recently read characters in the data state are <!--. Add a couple of assertions for paranoia. svn path=/trunk/hubbub/; revision=5146
* Do for public IDs what r5107 did for system IDs.  Andrew Sidwell  2008-08-13  1  -14/+4
| | | | svn path=/trunk/hubbub/; revision=5108
* Another COLLECT() -> COLLECT_MS() fix.  Andrew Sidwell  2008-08-13  1  -14/+4
| | | | svn path=/trunk/hubbub/; revision=5107
* Add page which crashed, and fix the bug that caused it to do so.  Andrew Sidwell  2008-08-13  1  -4/+2
| | | | svn path=/trunk/hubbub/; revision=5106
* Remove the CHAR() macro, which lets make test run again.  Andrew Sidwell  2008-08-13  1  -80/+74
| | | | svn path=/trunk/hubbub/; revision=5104
* Optimise COLLECT_MS() macro.  Andrew Sidwell  2008-08-13  1  -5/+3
| | | | svn path=/trunk/hubbub/; revision=5099
* Fix segfault in elimination of duplicate attributes.  John Mark Bell  2008-08-13  1  -7/+8
| | | | svn path=/trunk/hubbub/; revision=5098
* Optimise comment states slightly, taking advantage of the fact that buffers ↵  Andrew Sidwell  2008-08-13  1  -20/+1
| | | | | | store their own length and when emitting the comment, the buffer contains the whole comment and nothing else. svn path=/trunk/hubbub/; revision=5095
* Fix tokeniser so make test passes, with possible perf hit.  Andrew Sidwell  2008-08-13  1  -18/+43
| | | | svn path=/trunk/hubbub/; revision=5093
* Use COLLECT_MS() macro rather than COLLECT() in attribute values.  Andrew Sidwell  2008-08-13  1  -4/+4
| | | | svn path=/trunk/hubbub/; revision=5086
* Sanity checking for string data  John Mark Bell  2008-08-13  1  -0/+39
| | | | svn path=/trunk/hubbub/; revision=5080
* Remember to clear the self-closing flag when emitting a tag token.  Andrew Sidwell  2008-08-11  1  -0/+3
| | | | svn path=/trunk/hubbub/; revision=5030
* - Remove an unused function from utils/string.c  Andrew Sidwell  2008-08-11  1  -46/+1
| | | | | | | - Remove the no-op FINISH() macro from the tokeniser - Fix a typo in the charset detector svn path=/trunk/hubbub/; revision=5007
* Move one step closer to getting encoding changes working.  Andrew Sidwell  2008-08-11  1  -1/+1
| | | | svn path=/trunk/hubbub/; revision=5000
* Propagate more return codes up the chain from the token emitter.  Andrew Sidwell  2008-08-09  1  -55/+38
| | | | svn path=/trunk/hubbub/; revision=4980