Fetching, managing, and converting content
The modules in the content directory provide the infrastructure for fetching
data, managing it in memory, and converting it for display.
The data related to each URL used by NetSurf is stored in a 'struct content'
(known as a "content"). A content contains:
* a 'content type' which corresponds to the MIME type of the URL (for example
CONTENT_HTML, CONTENT_JPEG, or CONTENT_OTHER)
* a status (for example LOADING, DONE, or ERROR)
* type independent data such as the URL and raw source bytes
* a union of structs for type dependent data (for example 'struct
Contents are stored in a global linked list 'content_list', also known as the
memory cache.
The content_* functions provide a general interface for handling these
structures. They use a table of handlers to call type-specific code
('handler_map'). For example, content_redraw() may call html_redraw() or
nsjpeg_redraw() depending on the type of content.
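The table-driven dispatch described above can be sketched as follows. This is a minimal illustration, not the NetSurf source: every name prefixed with sketch_ is hypothetical, the structures are reduced to what the example needs, and treating a missing redraw handler as a successful no-op is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the real content types (hypothetical names). */
typedef enum {
	SKETCH_HTML,
	SKETCH_JPEG,
	SKETCH_OTHER,
	SKETCH_TYPE_COUNT
} sketch_type;

struct sketch_content {
	sketch_type type;
	int drawn_by;	/* set by a handler so the dispatch can be observed */
};

/* One table entry per content type; NULL means "no handler provided". */
struct sketch_handler_entry {
	bool (*redraw)(struct sketch_content *c, int x, int y);
};

static bool sketch_html_redraw(struct sketch_content *c, int x, int y)
{
	(void) x;
	(void) y;
	c->drawn_by = 1;	/* 1 marks the HTML handler */
	return true;
}

static bool sketch_jpeg_redraw(struct sketch_content *c, int x, int y)
{
	(void) x;
	(void) y;
	c->drawn_by = 2;	/* 2 marks the JPEG handler */
	return true;
}

/* The table is indexed by content type. */
static const struct sketch_handler_entry
		sketch_handler_map[SKETCH_TYPE_COUNT] = {
	[SKETCH_HTML] = { sketch_html_redraw },
	[SKETCH_JPEG] = { sketch_jpeg_redraw },
	[SKETCH_OTHER] = { NULL },
};

/* Generic entry point: look up the type-specific handler and call it. */
bool sketch_content_redraw(struct sketch_content *c, int x, int y)
{
	assert(c != NULL);
	assert(c->type < SKETCH_TYPE_COUNT);

	if (sketch_handler_map[c->type].redraw == NULL)
		return true;	/* assumption: nothing to draw is not an error */

	return sketch_handler_map[c->type].redraw(c, x, y);
}
```

Callers only ever use the generic function, so supporting a new content type in this sketch means adding one table entry rather than touching each call site.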
Each content has a list of users. A user is a callback function which is sent a
message (called) when something interesting happens to the content (for example,
it's ready to be displayed). Examples of users are browser windows (of HTML
contents) and HTML contents (of JPEG contents).
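As a rough illustration of the user mechanism, the sketch below keeps a small array of callbacks per content and calls each one when a message is broadcast. All names (sketch_add_user, sketch_broadcast, and so on), the fixed-size array, and the callback signature are assumptions made for the example and are not taken from the NetSurf source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical message codes, modelled on the messages named in this text. */
typedef enum {
	SKETCH_MSG_LOADING,
	SKETCH_MSG_READY,
	SKETCH_MSG_DONE,
	SKETCH_MSG_ERROR
} sketch_msg;

struct sketch_content;

/* A user is a callback plus a private pointer handed back on every call. */
typedef void (*sketch_callback)(sketch_msg msg, struct sketch_content *c,
		void *pw);

#define SKETCH_MAX_USERS 8

struct sketch_user {
	sketch_callback callback;
	void *pw;
};

struct sketch_content {
	struct sketch_user users[SKETCH_MAX_USERS];
	size_t user_count;
};

/* Register a user; returns false if the fixed-size array is full. */
bool sketch_add_user(struct sketch_content *c, sketch_callback cb, void *pw)
{
	assert(c != NULL);
	assert(cb != NULL);

	if (c->user_count == SKETCH_MAX_USERS)
		return false;

	c->users[c->user_count].callback = cb;
	c->users[c->user_count].pw = pw;
	c->user_count++;
	return true;
}

/* Send one message to every registered user, in registration order. */
void sketch_broadcast(struct sketch_content *c, sketch_msg msg)
{
	assert(c != NULL);

	for (size_t i = 0; i < c->user_count; i++)
		c->users[i].callback(msg, c, c->users[i].pw);
}

/* Example user: stores the last message in the int that pw points to. */
void sketch_record_user(sketch_msg msg, struct sketch_content *c, void *pw)
{
	(void) c;
	*(int *) pw = (int) msg;
}
```

In this sketch a browser window and an HTML content would each register their own callback, and neither needs to know about the other.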
Some content types may not be shared among users: an HTML content is dependent
on the width of the window, so sharing by two or more windows wouldn't work.
Thus there may be more than one content with the same URL in memory.
The status of a content follows a fixed order. Certain content functions change
the status, and each change of status results in a message to all users of the
content:
- content_create() creates a content in status TYPE_UNKNOWN
- content_set_type() takes a content from TYPE_UNKNOWN to one of
* LOADING (sends optional MSG_NEWPTR followed by MSG_LOADING)
* ERROR (sends MSG_ERROR)
- content_process_data() takes LOADING to one of
* LOADING (no message)
* ERROR (MSG_ERROR)
- content_convert() takes LOADING to one of
* READY (MSG_READY)
* DONE (MSG_READY, MSG_DONE)
* ERROR (MSG_ERROR)
- a content can move from READY to DONE by itself, for example HTML contents
become DONE when all images are fetched and the document is reformatted
- content_stop() aborts loading of a READY content and results in status DONE
The type-specific functions for a content are as follows (where 'type' is
replaced by the name of the content type, for example html or nsjpeg):
type_create():: called to initialise type-specific fields in the content
type_process_data():: called when some data arrives. Optional.
type_convert():: called when data has finished arriving. The content needs to be
converted for display. Must set the status to one of
CONTENT_STATUS_READY or CONTENT_STATUS_DONE if no error occurs.
Optional, but probably required for non-trivial types.
type_reformat():: called when, for example, the window has been resized, and the
content needs reformatting for the new size. Optional.
type_destroy():: called when the content is being destroyed. Free all resources.
type_redraw():: called to plot the content to screen.
type_redraw_tiled():: called to plot the content tiled across the screen.
type_stop():: called when the user interrupts in status CONTENT_STATUS_READY.
Must stop any processing and set the status to CONTENT_STATUS_DONE.
Required iff the status can be CONTENT_STATUS_READY.
type_open():: called when a window containing the content is opened. Probably
only makes sense if no_share is set for the content type in
type_close():: called when the window containing the content is closed.
If an error occurs in type_create(), type_process_data(), or type_convert(),
CONTENT_MSG_ERROR must be broadcast and false returned. Optionally use
warn_user() for serious errors. The _destroy function will be called soon after.
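The error convention above might look like this in a convert handler. The sketch is hedged throughout: the sk_ names are hypothetical, the broadcast is replaced by a stand-in that merely records the message, the "NoData" string is invented for the example, and the generic wrapper setting the ERROR status when the handler returns false is an assumption about how the caller behaves.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum {
	SK_STATUS_LOADING,
	SK_STATUS_READY,
	SK_STATUS_DONE,
	SK_STATUS_ERROR
} sk_status;

struct sk_content {
	sk_status status;
	const char *source_data;
	size_t source_size;
	const char *last_error;	/* stand-in for a message sent to users */
};

/* Stand-in for broadcasting CONTENT_MSG_ERROR: just record the message. */
static void sk_broadcast_error(struct sk_content *c, const char *message)
{
	c->last_error = message;
}

/* Hypothetical type_convert(): fails if no data arrived at all. */
bool sk_example_convert(struct sk_content *c)
{
	assert(c != NULL);

	if (c->source_size == 0) {
		/* on error: broadcast first, then return false */
		sk_broadcast_error(c, "NoData");
		return false;
	}

	/* on success the handler itself sets READY or DONE */
	c->status = SK_STATUS_DONE;
	return true;
}

/* Assumed generic wrapper: records ERROR when the handler reports failure. */
void sk_content_convert(struct sk_content *c)
{
	assert(c != NULL);
	assert(c->status == SK_STATUS_LOADING);

	if (!sk_example_convert(c))
		c->status = SK_STATUS_ERROR;
}
```

The handler does not free anything on the error path in this sketch, since the text states that the _destroy function is called soon afterwards.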
Each content structure is allocated using talloc, and all data related to a
content should be allocated as a child block of the content structure using
talloc. This will ensure that all memory used by a content is freed.
Contents must keep an estimate of non-talloc allocations in the total_size
attribute. This is used to control the size of the memory cache.
Creating and fetching contents
A high-level interface for starting the process of fetching and converting a URL
is provided by the fetchcache functions, which check the memory cache for a URL
and fetch, convert, and cache it if not present.
The fetch module provides a low-level URL fetching interface.
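The memory-cache check can be pictured as a walk over the linked list of contents, skipping entries that may not be shared. The sketch below is an assumption-laden illustration: the sk_ names, the user_count field, and the rule that a no_share content can be reused only while it has no users are choices made for the example, not the actual fetchcache logic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Reduced stand-in for a cached content (hypothetical fields). */
struct sk_content {
	const char *url;
	bool no_share;			/* e.g. HTML, which depends on window width */
	unsigned int user_count;
	struct sk_content *next;	/* next entry in the cache list */
};

/*
 * Look for a reusable content with the given URL.
 * Returns NULL when the caller would have to create and fetch a new one.
 */
struct sk_content *sk_cache_find(struct sk_content *list, const char *url)
{
	assert(url != NULL);

	for (struct sk_content *c = list; c != NULL; c = c->next) {
		if (strcmp(c->url, url) != 0)
			continue;
		/* assumption: an unshareable content in use cannot be reused */
		if (c->no_share && c->user_count != 0)
			continue;
		return c;
	}

	return NULL;
}
```

Under these assumptions a second window asking for the same HTML page gets NULL and triggers a fresh content, which matches the earlier observation that several contents with the same URL may be in memory at once.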