Fetching, managing, and converting content ========================================== The modules in the content directory provide the infrastructure for fetching data, managing it in memory, and converting it for display. Contents -------- The data related to each URL used by NetSurf is stored in a 'struct content' (known as a "content"). A content contains * a 'content type' which corresponds to the MIME type of the URL (for example CONTENT_HTML, CONTENT_JPEG, or CONTENT_OTHER) * a status (for example LOADING, DONE, or ERROR) * type independent data such as the URL and raw source bytes * a union of structs for type dependent data (for example 'struct content_html_data') Contents are stored in a global linked list 'content_list', also known as the "memory cache". The content_* functions provide a general interface for handling these structures. They use a table of handlers to call type-specific code ('handler_map'). For example, content_redraw() may call html_redraw() or nsjpeg_redraw() depending on the type of content. Each content has a list of users. A user is a callback function which is sent a message (called) when something interesting happens to the content (for example, it's ready to be displayed). Examples of users are browser windows (of HTML contents) and HTML contents (of JPEG contents). Some content types may not be shared among users: an HTML content is dependent on the width of the window, so sharing by two or more windows wouldn't work. Thus there may be more than one content with the same URL in memory. Content status -------------- The status of a content follows a fixed order. Certain content functions change the status, and each change of status results in a message to all users of the content: - content_create() creates a content in status TYPE_UNKNOWN - content_set_type() takes a content TYPE_UNKNOWN to one of * LOADING (sends optional MSG_NEWPTR followed by MSG_LOADING) * ERROR (sends MSG_ERROR) - content_process_data() takes LOADING to one of * LOADING (no message) * ERROR (MSG_ERROR) - content_convert() takes LOADING to one of * READY (MSG_READY) * DONE (MSG_READY, MSG_DONE) * ERROR (MSG_ERROR) - a content can move from READY to DONE by itself, for example HTML contents become DONE when all images are fetched and the document is reformatted (MSG_DONE) - content_stop() aborts loading of a READY content and results in status DONE (MSG_DONE) Type functions -------------- [[typefunc]] The type-specific functions for a content are as follows (where 'type' is replaced by something): type_create():: called to initialise type-specific fields in the content structure. Optional. type_process_data():: called when some data arrives. Optional. type_convert():: called when data has finished arriving. The content needs to be converted for display. Must set the status to one of CONTENT_STATUS_READY or CONTENT_STATUS_DONE if no error occurs. Optional, but probably required for non-trivial types. type_reformat():: called when, for example, the window has been resized, and the content needs reformatting for the new size. Optional. type_destroy():: called when the content is being destroyed. Free all resources. Optional. type_redraw():: called to plot the content to screen. type_redraw_tiled():: called to plot the content tiled across the screen. Optional. type_stop(): called when the user interrupts in status CONTENT_STATUS_READY. Must stop any processing and set the status to CONTENT_STATUS_DONE. Required iff the status can be CONTENT_STATUS_READY. type_open(): called when a window containing the content is opened. Probably only makes sense if no_share is set for the content type in handler_map. Optional. type_close():: called when the window containing the content is closed. Optional. If an error occurs in type_create(), type_process_data(), type_convert(), CONTENT_MSG_ERROR must be broadcast and false returned. The _destroy function will be called soon after. Memory allocation ----------------- Each content structure is allocated using talloc, and all data related to a content should be allocated as a child block of the content structure using talloc. This will ensure that all memory used by a content is freed. Contents must keep an estimate of non-talloc allocations in the total_size attribute. This is used to control the size of the memory cache. Creating and fetching contents ------------------------------ A high-level interface to starting the process of fetching and converting an URL is provided by the fetchcache functions, which check the memory cache for a url and fetch, convert, and cache it if not present. The fetch module provides a low-level URL fetching interface. Adding support for a new content type ------------------------------------- Addition of support for new content types is fairly simple and the process is as follows: - Implement, or at least stub out, the new content type handler. See the <> section above for details of the type handler API. - Add a type value to the 'content_type' enumeration (content_type.h) - Add an entry for the new type's private data in the 'data' union within 'struct content' (content.h) - Add appropriate mappings in the 'mime_map' table from MIME type strings to the 'content_type' value created. (content.c) - Add a textual name for the new content type to 'content_type_name'. This array is indexed by 'content_type'. (content.c) - Add an entry for the new content type's handler in the 'handler_map' array. This array is indexed by 'content_type'. (content.c) For examples of content type handlers, consult the image/ directory. The JPEG handler is fairly self-explanatory.