US9256584B2 - Rich text handling for a web application - Google Patents

Rich text handling for a web application

Info

Publication number
US9256584B2
US9256584B2 US13/948,728 US201313948728A US9256584B2 US 9256584 B2 US9256584 B2 US 9256584B2 US 201313948728 A US201313948728 A US 201313948728A US 9256584 B2 US9256584 B2 US 9256584B2
Authority
US
United States
Prior art keywords
word
rich text
dictionary
text
misspelled
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related, expires
Application number
US13/948,728
Other versions
US20130311879A1 (en)
Inventor
James R. Wason
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Kyndryl Inc
Original Assignee
International Business Machines Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by International Business Machines Corp
Priority to US13/948,728 (patent US9256584B2)
Assigned to INTERNATIONAL BUSINESS MACHINES CORPORATION. Assignment of assignors interest (see document for details). Assignors: WASON, JAMES R.
Publication of US20130311879A1
Priority to US15/009,027 (patent US10169310B2)
Application granted
Publication of US9256584B2
Assigned to KYNDRYL, INC. Assignment of assignors interest (see document for details). Assignors: INTERNATIONAL BUSINESS MACHINES CORPORATION
Adjusted expiration
Expired - Fee Related

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/12 Use of codes for handling textual entities
    • G06F40/14 Tree-structured documents
    • G06F40/143 Markup, e.g. Standard Generalized Markup Language [SGML] or Document Type Definition [DTD]
    • G06F17/2276
    • G06F17/2247
    • G06F17/24
    • G06F17/273
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/12 Use of codes for handling textual entities
    • G06F40/151 Transformation
    • G06F40/157 Transformation using dictionaries or tables
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/166 Editing, e.g. inserting or deleting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/232 Orthographic correction, e.g. spell checking or vowelisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/237 Lexical tools
    • G06F40/242 Dictionaries
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10 TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10S TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00 Data processing: database and file management or data structures
    • Y10S707/99941 Database schema or data structure
    • Y10S707/99942 Manipulating data structure, e.g. compression, compaction, compilation

Definitions

  • the present invention generally relates to rich text capability for Web based applications and Web browsers, and more specifically, to a system and method for representing and controlling rich text in memory and various text representations.
  • Web browser based applications are becoming increasingly popular. These browser based applications necessarily handle documents of various types. However, document handling and management of documents as they change over time to include new or varying content can be very expensive and cumbersome. Flexibility in representing and handling documents, including those stored in relational databases, is limited. One specific example of a major drawback is the lack of a robust rich text capability.
  • Standard Web browsers do not provide full feature rich text edit functions. This includes, for example, the general lack of ability to change font face, size and color, underline, bold, italic, to create tables and lists (both ordered and unordered), to check spelling, and to add in-line images or file attachments. Further, images and file attachments typically cannot be added as links to other Uniform Resource Locators (URL), or uploaded from a local file system into Binary Large Object (BLOB) data stored on a server.
  • Some known web browsers have features that allow direct editing of hypertext mark-up language (html) features of a page (i.e., the “content editable” feature) which effectively creates a text area that allows limited rich text editing. These browsers, however, do not provide any method to save changes to rich text that have been made through its editing facilities. Most browsers, however, do not provide any rudimentary text or other type of editing features.
  • the present invention overcomes the problems set forth.
  • a method for managing rich text applications such as Web based applications and browsers.
  • the method comprises representing the rich text in a memory structure representation and providing one or more classes for use by the applications and browsers to create the memory structure representation representative of rich text.
  • the classes include a rich text list class for managing one or more rich text nodes and a rich text class to create rich text nodes that represent a unit of rich text and its attributes.
  • the memory structure representation is used that was created by the provided classes.
  • a method is provided to represent and manage rich text for use by applications and browsers that involves representing the rich text in a memory structure representation and providing classes for use by the application and browsers to create the memory structure representation.
  • a spell checker is additionally provided to facilitate correcting misspelled words.
  • the spell checker utilizes the memory structure representation and the provided rich text classes.
  • the spell checker employs a dictionary wherein each word of the dictionary has a signature associated with the word to facilitate searching for substitute words.
  • an apparatus of the invention provides components for representing and managing rich text for use by the applications and browsers.
  • the apparatus includes a component for representing rich text in a memory structure representation and a component for providing one or more classes for use by the applications and browsers to create the memory structure representation.
  • a component for editing rich text in a document using the rich text classes is provided, as is a spell-checking component.
  • a computer program codes comprising a computer usable medium having a computer readable program code embodied in the medium.
  • the computer program codes include a first computer program code to provide one or more classes for use by applications to at least create and manage one or more rich text nodes in a memory structure representation representative of rich text. Additionally, a second computer program code to represent the rich text in the memory structure representation, and a third computer program code to edit rich text in a document using the memory structure representation to perform editing functions on a document having rich text as managed and created by the one or more classes are provided.
  • FIG. 1 is a block diagram showing an illustrative context of the present invention.
  • FIG. 2A is a relational block diagram illustrating various aspects according to the present invention.
  • FIG. 2B is a relational block diagram for a rich text list and rich text nodes according to the present invention.
  • FIG. 2C is a description of possible contents of a rich text node according to the present invention.
  • FIG. 3 is a relational block diagram of table node and subclass nodes according to the present invention.
  • FIG. 4 is a relational block diagram of rich text nodes according to the present invention.
  • FIG. 5 is a functional block diagram showing steps and components involved in creating various types of rich text nodes according to the present invention.
  • FIG. 6 is a functional block diagram showing steps to process a rich text list.
  • FIG. 7 is a functional block diagram showing steps and results of processing a table node according to the present invention.
  • FIG. 8 is a functional block diagram showing the results of processing a rich text list according to the present invention.
  • FIG. 9 is a block diagram showing components involved in processing a databody with rich text using an aggregate editor according to the present invention.
  • FIG. 10 is a relational block diagram showing the relationship of components in editing a databody, images or attachments by an aggregate editor and a rich text editor with a browser according to the present invention.
  • FIG. 11A is an illustration of a browser screen in browse mode with rich text according to the present invention.
  • FIG. 11B is an example of an edit screen and controls according to FIG. 11A.
  • FIG. 11C is another example of an edit screen and tool bar controls for editing rich text according to the present invention.
  • FIG. 11D is an example of a browser screen for editing rich text with a browser according to the present invention.
  • FIG. 12A is an example of editing rich text tables and lists according to the present invention.
  • FIG. 12B is another example of editing rich text tables and lists according to the present invention.
  • FIG. 13A is an example of editing rich text to select or browse a URL according to the present invention.
  • FIG. 13B is an example of editing rich text for images, attachments, or links according to the present invention.
  • FIG. 14 shows a spelling check screen for determining replacement words in a rich text document.
  • FIGS. 15A and 15B are flow diagrams showing steps of using the present invention to represent rich text in a memory structure.
  • FIG. 16 is a flow diagram showing steps of processing text to represent rich text in a memory structure.
  • FIG. 17 is a flow diagram showing steps of using the present invention from a Web type application.
  • FIG. 18 is a flow diagram showing the steps of providing a spell check function for a rich text document according to the present invention.
  • This invention provides a full feature rich text edit capability for a standard Web browser and other applications.
  • the present invention provides a method and system to consistently represent rich text in memory structure in order to facilitate editing and managing documents containing such rich text.
  • These memory structures may be resident on a computer, server or other known hardware.
  • the documents may include, for example, html documents presented via a web browser or other web based applications. These documents may contain text, tables, images, links and the like in which the system and method of the present invention represents such elements as rich text in such documents.
  • a client computer 1 is provided with a browser having an applet for accessing Web applications typically over a network such as the Internet 2 .
  • a server 3 with servlet is connected to the Internet 2 and a database 4 .
  • the server 3 and associated database 4 provide for a Web based application in communication with the client computer 1 .
  • the browser can be optimized for providing capabilities for any known browser or application. This is achieved by controlling rich text from its memory representation. All other representations such as in a database, html from a Web browser, or any other new potential source such as Rich Text Format (RTF), may be mapped to the controlled memory format.
  • the memory format may then be used to create new representations of the rich text for various purposes such as, for example, editing, or to show misspelled words by highlighting, html, plain text, and the like.
  • each rich text field is represented by a controller class (e.g., the rich text class), and subsidiary classes that hold the rich text content.
  • the most basic of these is the rich text node, which represents a single atomic unit of the rich text (i.e., text with its attributes such as font face, font size, underlining, italics, etc.).
  • the rich text node may also have attributes to determine, for example, if the text is bold, underlined, italic, or another attribute may determine if that text node should start a new paragraph. Essentially any text attribute can be represented.
  • FIG. 2A is a relational block diagram illustrating various aspects according to the present invention.
  • FIG. 2A shows a memory structure 100 comprising a rich text list class for controlling the collection of rich text nodes (e.g., RichTextNode in EADRichTextNode class) in various string representations, generally represented as 101 , 102 , 103 , and 104 .
  • the string representations 101 - 104 may include, for example, a long string stored as a Character Large Object (CLOB) 101 in a database (such as a relational database DB2), html representation 102 to display on the Web, plain text 103 to use as the editable text of a rich text editor, and text 104 used for spell checking.
  • the present invention also provides methods (e.g., JAVA methods, or the like) to access and convert rich text structures from and into various formats.
  • FIG. 2B is a relational block diagram for a rich text list and rich text nodes according to the present invention.
  • one or more rich text nodes 105 which make up the rich text, are controlled by a rich text list class node 106 (e.g., EADPRichTextList).
  • the rich text list class node 106 is a controller class, which contains a top-level list of one or more rich text nodes 105 .
  • These rich text nodes 105 can then be used to start table nodes 107 that eventually point down to other rich text nodes 105 in table cells 108 that include heading and row cells.
  • This nested structure of text nodes and tables may be representative of the general memory structure of the rich text.
  • this rich text list class 106 maintains a list of rich text nodes 105 (e.g., RichTextNode).
  • representing tables and lists may include nested structures of rich text nodes 105 , table nodes 107 , and table cells 108 .
  • FIG. 2C is a description of possible contents of a rich text node, i.e., RichTextNode class and its memory structure.
  • RichTextNode class is used in conjunction with applications such as Web browsers and the class is instantiated as necessary when used by the applications.
  • rich text contains text (string data) with attributes to control its presentation. These may include for example the font face, font size, font color, and whether or not the text is italicized, underlined, or bold. Segments of text where these attributes are the same are represented as a single rich text node (e.g., the JAVA class EADPRichTextNode).
  • the RichTextNode class of a rich text node 105 may include a few additional properties, such as whether it is at a line break, or whether it starts a table.
  • the text property is used to store the text string for a rich text node. In this case the contents of an html image tag (or xml) are stored in the text property of the rich text nodes.
  • the rich text node can also represent the location of an image or link. In this case it stores all the information needed to create the html for that image or link.
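  • As an illustration of the node described above, the following is a minimal Java sketch, not the patented EADPRichTextNode class. The class name `RichTextNodeSketch`, its field names, and the `sameAttributes` helper are assumptions introduced here; the sketch only shows how one run of text and its attributes might be held together, and how runs with identical attributes could be recognized as belonging in a single node.

```java
import java.util.Objects;

// Hypothetical sketch: one run of text plus the attributes that control its presentation.
final class RichTextNodeSketch {
    String text;
    String fontFace;       // null means "not specified"
    String fontSize;
    String fontColor;
    boolean bold, italic, underline;
    boolean lineBreak;     // node sits at a line break
    boolean startsTable;   // node anchors a table
    boolean imageOrLink;   // text holds the contents of an image or link tag

    RichTextNodeSketch(String text) {
        this.text = Objects.requireNonNull(text);
    }

    // Adjacent runs can share one node only when every display attribute matches.
    boolean sameAttributes(RichTextNodeSketch other) {
        return Objects.equals(fontFace, other.fontFace)
            && Objects.equals(fontSize, other.fontSize)
            && Objects.equals(fontColor, other.fontColor)
            && bold == other.bold
            && italic == other.italic
            && underline == other.underline;
    }

    public static void main(String[] args) {
        RichTextNodeSketch plain = new RichTextNodeSketch("Hello ");
        RichTextNodeSketch strong = new RichTextNodeSketch("world");
        strong.bold = true;
        System.out.println("mergeable: " + plain.sameAttributes(strong));
    }
}
```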
  • FIG. 3 is a relational block diagram of table nodes and sub class nodes. Specifically, FIG. 3 shows a table structure generally shown as 120 .
  • the format of the table structure 120 may be represented in memory as a set of special rich text node types including table node 121 , table body node 122 and table header node 123 (for defining table characteristics), table row node 124 , heading cell node 125 and row cell node 126 corresponding to the various types of html tags controlling table representation.
  • each type of node maintains a reference to the nodes it controls for the next level. For example, the table row node 124 controls a list of row cell nodes 126 , and the table body node 122 controls a list of table row nodes 124 .
  • the header cell node 125 and row cell node 126 maintain lists of rich text nodes 105 a , representing the content of those cells.
  • the rich text node 105 a may contain an anchor point to another table node 121 to start a new table at that point in the rich text. This structure allows for nested tables.
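  • The table hierarchy above can be pictured with a small Java sketch. All class names here (`TableStructureSketch`, `TextNode`, `CellNode`, `RowNode`, `SectionNode`, `TableNode`) are hypothetical stand-ins, and the header and body are modeled with one shared section type for brevity. The `depth` helper is only there to show that a text node anchoring another table gives nested tables.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the table node hierarchy; names are not from the source.
final class TableStructureSketch {
    static final class TextNode {
        final String text;
        TableNode table;               // non-null when a new table starts at this node
        TextNode(String text) { this.text = text; }
    }
    static final class CellNode {
        final boolean heading;         // true for a heading cell, false for a row cell
        final List<TextNode> content = new ArrayList<>();
        CellNode(boolean heading) { this.heading = heading; }
    }
    static final class RowNode {
        final List<CellNode> cells = new ArrayList<>();
    }
    static final class SectionNode {   // stands in for a table header or table body node
        final List<RowNode> rows = new ArrayList<>();
    }
    static final class TableNode {
        final SectionNode header = new SectionNode();
        final SectionNode body = new SectionNode();
    }

    // Each level holds references to the level below, so nesting is found by walking down.
    static int depth(TableNode table) {
        int nested = 0;
        for (SectionNode section : Arrays.asList(table.header, table.body)) {
            for (RowNode row : section.rows) {
                for (CellNode cell : row.cells) {
                    for (TextNode node : cell.content) {
                        if (node.table != null) {
                            nested = Math.max(nested, depth(node.table));
                        }
                    }
                }
            }
        }
        return 1 + nested;
    }

    public static void main(String[] args) {
        System.out.println(depth(new TableNode()));
    }
}
```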
  • the rich text is stored as a string in the relational database, and may be stored in a CLOB column due to a potentially large string size.
  • this string can be formatted such as converting the rich text into the html string for storage.
  • Another is to convert into xml. This approach may have some advantages if other applications are able to process the xml directly as it is stored in the relational database.
  • a third alternative which has the advantage of requiring less storage space, is to use a compressed format where the various attributes of each rich text node are captured, along with the text value for that node.
  • the method to convert the rich text to string is similar to the method for generating an html string, except for formatting of each part of the string.
  • the rich text node has the ability to parse a well-formed segment of html and set its attributes accordingly. This includes the ability to create other rich text nodes as needed as the html indicates a change in text attributes or the presence of an image or link.
  • a function in the rich text list takes html that may not be well formed (i.e., non-well formed html), and preprocesses the html to make it recognizable by the rich text nodes.
  • the rich text list also handles creating the nodes for the table structures included within the html.
  • the rich text node has the ability to parse a well-formed segment of html.
  • a well formed segment of html may include, for example:
  • the tags that are of particular interest are table type tags, image and link tags, and the tags for the rich text attributes (e.g., font, italic, bold, underline, break and paragraph tags).
  • a set of these tags can be used to define the attributes for one rich text node.
  • a single rich text node may be represented as:
  • the parsing method for html handles this by creating a structure of rich text nodes using preceding and following node links as shown generally in FIG. 4 .
  • this structure may be very elaborate and may include many children nodes.
  • Three of these nodes 105 a , 105 b , and 105 c are arbitrarily chosen to further illustrate creation of memory structures from html in FIG. 4 .
  • In FIG. 5 , a block diagram showing steps and components involved in creating various types of rich text nodes is shown according to the present invention.
  • the block diagram of FIG. 5 (and FIGS. 6 and 7 ) may represent a structure of the present invention, as well as a high level flow diagram showing the steps implementing the present invention.
  • the steps are denoted by each of the structural blocks or within the structural blocks, and may be implemented using a plurality of separate dedicated or programmable integrated or other electronic circuits or devices.
  • a suitably programmed general purpose computer, e.g., a microprocessor, microcontroller or other processor device (CPU or MPU), either alone or in conjunction with one or more peripheral (e.g., integrated circuit) data and signal processing devices can be used to implement the invention.
  • any device or assembly of devices on which a finite state machine capable of implementing the flow charts shown in the figures resides can be used as a controller with the invention.
  • the steps may equally be implemented on any known medium.
  • the current node 105 b reflects the current attributes of rich text node 105 .
  • the rich text list 106 passes, at step S 1 , well-formed segments of html to the rich text node 105 . (The overall operation of the rich text list 106 will be described in more detail below). Also, the steps of the parsing method of rich text node 105 are shown in relation to the preceding and following nodes which are now produced.
  • the rich text node 105 performs some cleanup, as needed, on the passed html it has been asked to parse as shown at step S 3 .
  • the unparsed html is assigned to the text attribute of the rich text node.
  • the parsing method of rich text node 105 then calls resolveText method at step S 5 to parse the html.
  • the resolve text method of step S 5 extracts tag information from the text attribute, then uses that tag information to set the other attributes in the rich text node by calling the resolveTag method 130 , shown as step S 6 , and then sets the text to the text it parsed without the tag it just extracted.
  • the steps of the resolveTag method 130 include the following:
  • Pass the tag information (the text between the “<” and “>”) to resolve the tag and to set up the tag attributes, shown at step S 9 . If this is an image or link tag, it requires that the attributes are stored in the text. This is the reason for moving the original text to the following node.
  • If the preceding or following nodes are not null, call resolve tag 130 on them, making the preceding or following node (as appropriate) the current node, which recursively propagates more rich text nodes as necessary to fully represent the rich text.
  • the resolveTag method 130 is relatively straightforward, except for the image tags.
  • the resolveTag method 130 may determine the type of the tag; for <i>, <strong>, <u>, <p>, or <br> it simply sets “on” the corresponding boolean attribute.
  • For font tags, the content of the tag is parsed to determine if it has size, face or color information, and these attributes are set accordingly if they have been specified.
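  • The tag handling just described might look roughly like the following Java sketch. It is an assumption-laden illustration rather than the resolveTag method 130 itself: the class name `ResolveTagSketch`, the field names, and the regular expression for quoted font attributes are all introduced here, and image and link tags are deliberately left out.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: map the text between '<' and '>' onto node attributes.
final class ResolveTagSketch {
    boolean italic, bold, underline, paragraph, lineBreak;
    String fontFace, fontSize, fontColor;

    // Assumes font attributes are written as name="value".
    private static final Pattern FONT_ATTR =
        Pattern.compile("(size|face|color)\\s*=\\s*\"([^\"]*)\"", Pattern.CASE_INSENSITIVE);

    void resolveTag(String tagInfo) {
        String trimmed = tagInfo.trim();
        String name = trimmed.split("\\s+", 2)[0].toLowerCase(Locale.ROOT);
        switch (name) {
            case "i":      italic = true;    break;
            case "strong": bold = true;      break;
            case "u":      underline = true; break;
            case "p":      paragraph = true; break;
            case "br":     lineBreak = true; break;
            case "font": {
                // Only the attributes actually present in the tag are set.
                Matcher m = FONT_ATTR.matcher(trimmed);
                while (m.find()) {
                    String key = m.group(1).toLowerCase(Locale.ROOT);
                    if (key.equals("size")) {
                        fontSize = m.group(2);
                    } else if (key.equals("face")) {
                        fontFace = m.group(2);
                    } else {
                        fontColor = m.group(2);
                    }
                }
                break;
            }
            default:
                break; // other tags are outside this sketch
        }
    }

    public static void main(String[] args) {
        ResolveTagSketch node = new ResolveTagSketch();
        node.resolveTag("font face=\"Arial\" size=\"3\"");
        System.out.println(node.fontFace + " " + node.fontSize);
    }
}
```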
  • Image tags are somewhat more complicated because the rich text editor overloads the file name with other information to set the alt tag, the height, the width, whether the image should float and whether the tag is to be treated as an in-line image, file attachment, or link.
  • If the image size is manipulated within a rich text editor, the browser generates back the resized image with the height and width in a style statement instead of as html tag attributes.
  • a style tag is generated with the float definition. All of this is written to the text attribute of the rich text node (each image tag requires its own rich text node).
  • FIG. 6 shows a block diagram including different structures or steps for processing a rich text list.
  • the rich text list 106 may perform some preprocessing of the html before it passes well formed segments of html to the rich text nodes 105 .
  • At step S 10 , the html is cleaned up by converting some substitution strings back to their original values, and suppressing meaningless tags such as </p>.
  • step S 11 html is well formed. If the html has previously passed through rich text processing (e.g., it was generated from a rich text list at one point and then modified by a rich text editor), it will have markers where the rich text nodes were broken out the last time through (these are separated by a ⁇ !% TT % ⁇ comment tag).
  • the incoming text is broken at these markers at step S 12 . While this process makes it more efficient to process html, during rich text editing for example, it is not strictly necessary. It is understood that a parser is capable of handling large chunks of raw html such as would be encountered during conversions from another source, or if a rich text was pasted into the rich text editor.
  • tags that are not of interest at this point are buffered at step S 13 by changing the start and end brackets to substitution strings. This includes table and list related tags, which are ignored now and restored later.
  • At step S 13 , a check is also made to ensure that the tags start and end in the proper order, and each start tag has a matching end tag within the segment. This is performed by bubbling up end tags that do not have matches within that segment, and then eliminating pairs of start and end tags that have no intervening content.
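  • A rough Java sketch of two of the preprocessing ideas above: buffering table and list tags by swapping their brackets for substitution strings, and eliminating start/end tag pairs with no intervening content. The substitution strings `{{lt}}` and `{{gt}}`, the tag list, and the class name `TagBufferSketch` are assumptions made for illustration; the bubbling up of unmatched end tags is not attempted here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of tag buffering and empty-pair removal.
final class TagBufferSketch {
    static final String OPEN = "{{lt}}";   // assumed substitution string for '<'
    static final String CLOSE = "{{gt}}";  // assumed substitution string for '>'

    private static final Pattern STRUCTURAL = Pattern.compile(
        "<(/?(?:table|thead|tbody|tr|th|td|ul|ol|li)\\b[^>]*)>", Pattern.CASE_INSENSITIVE);
    private static final Pattern EMPTY_PAIR =
        Pattern.compile("<([a-zA-Z]+)(?:\\s[^>]*)?>\\s*</\\1>");

    // Hide table and list tags so later passes treat them as plain text.
    static String buffer(String html) {
        String replacement =
            Matcher.quoteReplacement(OPEN) + "$1" + Matcher.quoteReplacement(CLOSE);
        return STRUCTURAL.matcher(html).replaceAll(replacement);
    }

    // Bring the hidden tags back once the other tags have been processed.
    static String restore(String buffered) {
        return buffered.replace(OPEN, "<").replace(CLOSE, ">");
    }

    // Repeat until stable so that pairs emptied by an earlier pass are removed too.
    static String dropEmptyPairs(String html) {
        String previous;
        String current = html;
        do {
            previous = current;
            current = EMPTY_PAIR.matcher(previous).replaceAll("");
        } while (!current.equals(previous));
        return current;
    }

    public static void main(String[] args) {
        String html = "<table><tr><td></td></tr></table><b><i></i></b>text";
        System.out.println(restore(dropEmptyPairs(buffer(html))));
    }
}
```

  • Buffering first is what keeps an empty table cell from being treated as a meaningless empty pair in this sketch.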
  • the segments are reconstituted into one string, again using the rich text node separator.
  • the table related tags, which were ignored previously, are restored.
  • the html is broken into segments at the ⁇ table> tags, and then organized into a new rich text list 132 that includes entries that are either simple strings 133 (for rich text node entries) or vectors 134 (for table entries).
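  • One way to picture the split at the table tags is the Java sketch below. It is a simplification under stated assumptions: a hypothetical `Segment` class with a `table` flag stands in for the string-versus-vector distinction in the text, and a nested table is kept inside its outermost table segment by counting open table tags.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: break html into plain segments and whole-table segments.
final class TableSplitSketch {
    static final class Segment {
        final boolean table;   // true when the segment is a complete outermost table
        final String html;
        Segment(boolean table, String html) { this.table = table; this.html = html; }
    }

    private static final Pattern TABLE_TAG =
        Pattern.compile("<(/?)table\\b[^>]*>", Pattern.CASE_INSENSITIVE);

    static List<Segment> split(String html) {
        List<Segment> out = new ArrayList<>();
        int depth = 0;
        int start = 0;
        Matcher m = TABLE_TAG.matcher(html);
        while (m.find()) {
            boolean closing = !m.group(1).isEmpty();
            if (!closing) {
                if (depth == 0) {
                    // Text before the outermost table becomes its own segment.
                    if (m.start() > start) {
                        out.add(new Segment(false, html.substring(start, m.start())));
                    }
                    start = m.start();
                }
                depth++;
            } else if (depth > 0) {
                depth--;
                if (depth == 0) {
                    out.add(new Segment(true, html.substring(start, m.end())));
                    start = m.end();
                }
            }
        }
        if (start < html.length()) {
            out.add(new Segment(false, html.substring(start)));
        }
        return out;
    }

    public static void main(String[] args) {
        for (Segment s : split("a<table><tr><td>x</td></tr></table>b")) {
            System.out.println((s.table ? "table: " : "text:  ") + s.html);
        }
    }
}
```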
  • the list version of resolveFromHtml method 136 is called to process this list.
  • the resolveFromHtml method 136 for the rich text node 106 is called.
  • These nodes may be added directly to the list of rich text nodes attached to the main rich text list 135 .
  • the resolveFromHtml method 140 for that table node 137 creates a new rich text node 138 in the next position in its main rich text list 135 , passing the vector that has the table information.
  • FIG. 7 is a block diagram showing steps and results of processing a table node.
  • the table structure is again generally shown as 120 , and is built in memory by successively resolving the tags through each type of table node, i.e., table header node 170 or table body node 172 .
  • the operations of table node 171 are essentially repeated by any succeeding table node type created by table node 171 , substantially a recursive operation.
  • the table node 171 reads the incoming tag up to the first end tag (>) to strip out its own tag information at step S 16 , then splits the rest at the next tag type, and passes each entry to that type of table node, either table header node 173 or table body node 172 .
  • Table row nodes 174 and row cell nodes 176 are created from the table body node 172 .
  • Heading cell nodes 175 are created from the table header node 173 .
  • the cell type tag nodes i.e., th and td nodes
  • FIG. 8 shows a block diagram showing the results of processing a rich text list.
  • If any rich text node 105 has a table node 171 , it calls the toHtml method for that table node (so that the html for that table is added to the resulting html string 182 before the next node in the main rich text list 106 is added).
  • Each node (e.g., 171 and 172 ) in the table structure adds its own tag information to the resulting html and then calls the toHtml method 180 for each of its dependent tags. This process continues until all nodes have been processed.
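  • The recursive generation described above can be sketched in Java as follows. The generic `TagNode` class and its `add`, `withText`, and `render` helpers are hypothetical and collapse the separate table node types into one; the point is only that each node writes its own tag and then asks its dependent nodes to do the same.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of recursive html generation for a table structure.
final class ToHtmlSketch {
    static final class TagNode {
        final String tag;
        final List<String> content = new ArrayList<>();     // text held by cell nodes
        final List<TagNode> children = new ArrayList<>();   // dependent nodes, next level down

        TagNode(String tag) { this.tag = tag; }

        TagNode add(TagNode child) { children.add(child); return this; }

        TagNode withText(String text) { content.add(text); return this; }

        // Write this node's start tag, its content, its children, then its end tag.
        void toHtml(StringBuilder out) {
            out.append('<').append(tag).append('>');
            for (String text : content) {
                out.append(text);
            }
            for (TagNode child : children) {
                child.toHtml(out);
            }
            out.append("</").append(tag).append('>');
        }
    }

    static String render(TagNode root) {
        StringBuilder out = new StringBuilder();
        root.toHtml(out);
        return out.toString();
    }

    public static void main(String[] args) {
        TagNode table = new TagNode("table").add(
            new TagNode("tbody").add(
                new TagNode("tr").add(new TagNode("td").withText("A"))));
        System.out.println(render(table));
    }
}
```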
  • Rich text is stored as a string in a relational database. Because of the potentially large size of this string, it may be stored in a CLOB column. In order to make this as compact as possible, and to reduce the amount of tag information stored as text (this is to make searching less confusing), most of the tag information in each rich text node may be stored in a compressed format. Arrays are kept of the permitted font face and color values, and the index of each entry in its array is stored. Also, other attributes such as bold, italic, underline and whether the rich text node is an image tag are boolean attributes, and what is stored for them is a null string for false and a one byte string for true. The table nodes are stored in their html tag format, except that the cell nodes may use the relational format for their rich text nodes.
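  • A minimal Java sketch of such a compressed encoding, under several assumptions that are not in the source: the permitted font face and color arrays are invented, fields are joined with a unit separator character, the flag byte is "1", and the class `CompactNodeCodec` with its nested `Node` is hypothetical. The actual stored layout is not specified in this text.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: store array indexes and one-byte flags instead of tag text.
final class CompactNodeCodec {
    static final List<String> FACES = Arrays.asList("Arial", "Courier", "Times");  // assumed values
    static final List<String> COLORS = Arrays.asList("black", "red", "blue");      // assumed values
    static final String SEP = "\u001F";  // assumed field separator

    static final class Node {
        String face, color, text;
        boolean bold, italic, underline;
    }

    static String encode(Node n) {
        return String.join(SEP,
            Integer.toString(FACES.indexOf(n.face)),    // index into the permitted faces
            Integer.toString(COLORS.indexOf(n.color)),  // index into the permitted colors
            n.bold ? "1" : "",                          // empty string for false, one byte for true
            n.italic ? "1" : "",
            n.underline ? "1" : "",
            n.text);
    }

    static Node decode(String stored) {
        // Limit of 6 keeps empty flag fields and leaves any separator inside the text intact.
        String[] f = stored.split(SEP, 6);
        Node n = new Node();
        int face = Integer.parseInt(f[0]);
        int color = Integer.parseInt(f[1]);
        n.face = face >= 0 ? FACES.get(face) : null;
        n.color = color >= 0 ? COLORS.get(color) : null;
        n.bold = !f[2].isEmpty();
        n.italic = !f[3].isEmpty();
        n.underline = !f[4].isEmpty();
        n.text = f[5];
        return n;
    }

    public static void main(String[] args) {
        Node n = new Node();
        n.face = "Courier";
        n.color = "red";
        n.bold = true;
        n.text = "hi";
        System.out.println(decode(encode(n)).text);
    }
}
```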
  • Databody fields can be stored in string, date, or numeric format and comprehensively represent the document contents.
  • Rich text is an added type for the databody field that is stored in string format.
  • An aggregate editor which is capable of manipulating and editing a databody, recognizes the rich text type, and has a rich text list as one of its attributes to hold the memory representation of the rich text. This is converted into the string format for the relational database and assigned to the column that holds string values.
  • FIG. 9 is a block diagram showing components involved in processing a databody with rich text using an aggregate editor according to the present invention.
  • Once the rich text structure is stored in a relational database according to aspects of the invention, it is retrievable for uses such as editing and updating.
  • a databody field e.g., 186
  • an aggregate editor 185 may retrieve the rich text string 186 from the column for string values 187 in the relational database 188 and convert that string into memory representation 189 using a toDb2 method 188 in its rich text list attribute.
  • the toDb2 method 188 follows the same pattern as the toHtml method 180 described previously. A difference is that the string may be split into rich text nodes, so that the toDb2 method 188 for each rich text node 105 does a simple conversion of its portion of the string into corresponding attributes.
  • a particular consideration is the presentation of image tags that are BLOB references. These are modified to assure that the URL for the servlet is the current one. This is done in the memory representation of the rich text list. Each of its rich text nodes is checked to see if it is an image node representing a BLOB reference, and if so, the servlet portion of the URL is modified to match the current URL.
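  • The URL adjustment described above could be sketched in Java along these lines. The URL shape is an assumption: the sketch supposes a BLOB reference is an image source whose query string carries a hypothetical `blobKey` parameter, and that everything before the `?` is the servlet portion to be replaced. The class and method names are likewise invented for illustration.

```java
// Hypothetical sketch: point a BLOB image reference at the current servlet URL.
final class BlobUrlSketch {
    static final String BLOB_PARAM = "blobKey=";  // assumed marker for a BLOB reference

    static boolean isBlobReference(String src) {
        int query = src.indexOf('?');
        return query >= 0 && src.indexOf(BLOB_PARAM, query) >= 0;
    }

    // Keep the query string (the keys) and swap in the current servlet URL in front of it.
    static String rebase(String src, String currentServletUrl) {
        if (!isBlobReference(src)) {
            return src;  // ordinary images and links are left untouched
        }
        return currentServletUrl + src.substring(src.indexOf('?'));
    }

    public static void main(String[] args) {
        System.out.println(rebase(
            "http://old.example/app/servlet?blobKey=42",
            "https://new.example/app/servlet"));
    }
}
```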
  • FIG. 10 is a relational block diagram showing the relationship of components in editing a databody and the like with a browser according to the present invention.
  • the type of a databody field e.g., 193 a
  • that field is presented as read-only with a link above it that, when clicked by a user, allows editing of the field.
  • the link may be to a JAVASCRIPT method that brings up a rich text-editing window (i.e., a new browser window).
  • This new window includes hidden html fields (i.e., hidden input fields) which contain the keys needed to process the field when edited (i.e., session key, manager key for the databody application class, row number of that databody field within the databody lists, etc.).
  • This new window also passes the rich text converted into html using a resolveFromHtml method 191 for the rich text list attribute of the databody aggregate editor 185 rich text list 106 .
  • the rich text editor 190 may retrieve any images or attachments 191 from a database, shown in part, as a database row 193 , using the servlet class doRichBlob 196 where the servlet is uploaded for parsing out of keys, byte array, etc.
  • the html for the rich text is assigned to a “content editable div” which allows the text to be edited directly.
  • the rich text edit window is a somewhat simple html form.
  • the rich text edit window is a frame.
  • the frame includes two parts, as shown in FIG. 11D , one to edit the rich text as plain text, i.e., frame 210 , using an applet 197 , and a second frame, i.e., frame 211 , to display the resulting rich text as it is edited.
  • the same applet 197 may be used with known editors, but, in embodiments, may remain hidden. Applets are typically client-side JAVA programs that are loaded and run within the framework of a Web browser.
  • the applet 197 may be linked to the html edit window using the LiveConnect feature of JAVASCRIPT.
  • each of the rich text editing functions 208 may call a JAVASCRIPT routine that invokes a function for rich text manipulation, and then passes the revised html to the applet 197 .
  • the applet 197 then processes the html, and writes the output back out to the “content editable div.”
  • the applet 197 uses the html to create a rich text list structure in its memory, and then converts that rich text structure back into html. This cleans up the html and makes it well formed. In the case of image tags inserted into the rich text by the rich text editor 190 , the applet 197 does a great deal more.
  • There are several functions in the EADP rich text classes to support the plain text editing of the rich text.
  • One is a method on all the rich text nodes to render them into plain text.
  • a simple rich text node is rendered to plain text, its text is written to the output string, along with a one byte separator (a non-editable break character).
  • the latter serves as a reminder that the plain text is really a representation of rich text, and also makes it easier to parse updates to the plain text representation to render it back into rich text.
  • the rich text node is an image node it reports itself in the plain text representation as an image or link. If it is the anchor point of a table node, it reports itself as a table. Note that the content of the table consists of titles and data cells, which are themselves rich text nodes, so it is possible to edit the table by editing its plain text representation.
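The plain text rendering described above might look like the following Python sketch. The separator value, the dictionary-based node shape, and the bracketed placeholders are assumptions made for illustration; the text only says that a one-byte, non-editable break character follows each node.

```python
SEPARATOR = "\x1f"  # assumed one-byte break character; the source does not name it


def to_plain_text(nodes):
    """Render each node to plain text, followed by the separator."""
    parts = []
    for node in nodes:
        kind = node.get("kind", "text")
        if kind in ("image", "link", "table"):
            # Non-text nodes report themselves by kind (placeholder format assumed).
            parts.append("[%s]" % kind)
        else:
            parts.append(node["text"])
    return "".join(part + SEPARATOR for part in parts)


def split_plain_text(plain):
    """Split an edited plain text string back into per-node pieces."""
    pieces = plain.split(SEPARATOR)
    # The trailing separator leaves one empty piece at the end; drop it.
    return pieces[:-1] if pieces and pieces[-1] == "" else pieces


nodes = [{"text": "Hello "}, {"kind": "image"}, {"text": "world"}]
plain = to_plain_text(nodes)
print(split_plain_text(plain))
```

Because every node ends with the separator, the pieces returned by `split_plain_text` line up one-to-one with the original nodes, which is what makes parsing updates back into rich text straightforward.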
  • FIGS. 11A-11D illustrate screen shot examples of rich text in browse and edit mode.
  • FIGS. 11B-11D show screen shots in edit mode showing various edit selections 208 including in the body of the browser ( FIG. 11B ) and a tool bar ( FIG. 11C ).
  • Another feature of the present invention is the ability to determine cursor position and selected text within the rich text node.
  • the text area in the applet 197 is able to report the cursor position and the start and end of selected text in the plain text representation. This is then interpreted to determine which parts of text and in which rich text nodes have been selected. Since text selection is typically related to a change in font characteristics, the text node may need to be split to allow the change in face size or color.
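Splitting a text node at the boundaries of a selection can be sketched as below. This is an illustrative Python version under simplifying assumptions: the `TextNode` class is hypothetical, only a `bold` attribute is modelled, and the offsets are taken over the concatenated node text with separators ignored.

```python
from dataclasses import dataclass, replace


@dataclass
class TextNode:
    """Hypothetical text node with a single font attribute."""
    text: str
    bold: bool = False


def split_at_selection(nodes, start, end):
    """Split nodes so that [start, end) falls on node boundaries.

    Returns the new node list and the nodes that lie inside the selection.
    """
    result, selected = [], []
    offset = 0
    for node in nodes:
        node_start, node_end = offset, offset + len(node.text)
        offset = node_end
        lo, hi = max(start, node_start), min(end, node_end)
        if lo >= hi:  # this node does not overlap the selection
            result.append(node)
            continue
        local_lo, local_hi = lo - node_start, hi - node_start
        if local_lo > 0:  # unselected head keeps the old attributes
            result.append(replace(node, text=node.text[:local_lo]))
        middle = replace(node, text=node.text[local_lo:local_hi])
        result.append(middle)
        selected.append(middle)
        if local_hi < len(node.text):  # unselected tail
            result.append(replace(node, text=node.text[local_hi:]))
    return result, selected


nodes, selected = split_at_selection([TextNode("Hello world")], 6, 11)
for node in selected:
    node.bold = True  # apply the font change only to the selected part
print([(n.text, n.bold) for n in nodes])
```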
  • Rich text editing functions of some browsers implementing the present invention provide three basic types of functions.
  • the first is a variety of ways to change the font and text characteristics (this includes font face, font size, font color, bold, italic, and underlining).
  • the second is the ability to insert an image at the current cursor position by specifying the local file name for that image.
  • the third is the ability to indicate selected text through use of the insert link tag by specifying a special URL for the link that indicates the advanced function to perform.
  • the advanced features of the rich text edit function are built on extensions of the image and link tag facilities.
  • the native function of the browser may be used to create an image or link tag with a file name or URL that is overloaded with additional parameters. This is then intercepted by JAVASCRIPT functions or the hidden applet 197 , and used to provide additional features.
  • EADP-based rich text editing of the present invention allows insertion of table structures and lists into the rich text area.
  • the button labeled “ListsAndTables” ( FIG. 11B ) (or the equivalent icons) invokes the image insertion function in the browser, but with a file name of “table”.
  • the hidden applet 197 intercepts the generated html, it first creates a rich text structure from the passed html, and then looks for an image tag with file name of “table.” If one exists, it brings up a frame (or panel) 212 a and 212 b that allows creation of tables and lists as shown in FIGS. 12A and 12B .
  • the options available from these frames 212 a and 212 b depend on where in the rich text it is invoked.
  • FIGS. 13A and 13B when the “Attachments” button 216 of FIG. 13B (or equivalent icon) is pressed, this invokes a JAVASCRIPT function that brings up a new html window (panel) 215 to process images, attachments, and links.
  • This panel 215 allows selection of whether to process the file or URL as an image, attachment, or link as shown by 216 .
  • the source can be either a local file or an existing URL.
  • a new html window is opened (not shown) to allow selection of the URL when the Browse URL button 217 is pressed.
  • the file button 218 ( FIG. 11A ) on the browser tool bar invokes the standard input of type file provided by all Web browsers. This allows the file contents to be uploaded to the server.
  • this html window 215 is opened, the keys to the current text being edited are added as hidden input fields (e.g., the session key, the manager key, and the databody row number).
  • this information along with the file name is used to create a new entry for the file contents in the BLOB table in the relational database on the supporting server. This data is uploaded and stored immediately to avoid problems in a clustered server environment (i.e., it is typically too expensive in a clustered environment to attempt to store the BLOB contents in session memory).
  • a URL e.g., Select URL button 219
  • This panel 215 allows the addition of a great deal more formatting of data for the image or attachment. This includes aspects that are needed for well formed and accessible html such as the alt tag, the size of the image, and whether it should float. All this may be added to the file name that is assigned to the image tag.
  • the OK button is pressed, the file is uploaded if need be, and the image creation function on the parent panel is called. This adds the image tag with the overloaded file name to the html, and invokes the applet 197 to intercept and resolve the html. The applet 197 then creates the rich text structure in memory from the passed html. When it processes each image tag, it resolves the file name by parsing out any information that was added as an overload. This additional information is used to set additional parameters in the image tag, to change the image tag to represent a file attachment, or to indicate that the image tag should write itself out as a simple link, for example.
  • the spell checking solution is optimized for use within a servlet environment.
  • Servlets are typically server-side JAVA programs that are loaded and run within the framework of a web server.
  • the dictionary functions all reside, preferably, on the server side, and reside as singletons in server memory so that they are extremely fast.
  • the returned html includes all misspelled words and possible replacements so that JAVASCRIPT functions on the client side can provide an interactive and responsive spelling correction.
  • the technique for dictionary creation and usage is also unique to this invention.
  • the spelling dictionary may be created initially from word lists then instantiated and serialized.
  • the serialized hashtable is held as property files in the JAVA code for the EADP (or equivalent) dictionary class (e.g., EADPSpellCheckController).
  • the structure of the dictionary is a hashtable, where the entries are lists of words. The keys to these entries are unique and provide powerful search ability.
  • each word is assigned a set of characteristic signatures. These characteristics can be simplified or enriched depending on the capabilities of the server holding the dictionary.
  • the possible sets of signatures are:
  • one signature is the first half of the word.
  • each word may be added to the list keyed by each of its signatures. Also, each word has a primary signature, its first three or four letters (or the entire word if it is short). A word is checked for correctness initially by determining if it is a member of the word list for its primary signature. If a word is not correctly spelled, replacements are determined by using all its signatures to find the words in the list for those signatures.
  • When a word is checked for correctness, it is first checked to see if it is present in the list for its primary signature. If it is not there, then it is not spelled correctly. In this case, a substitution list is created for the word. That consists of creating a set of signatures for the misspelled word, finding all the words in the lists keyed by those signatures, and then selecting the twenty best matches (ranked as described next) to the word in question.
  • the ranking is accomplished by creating a common list of all the potential replacements. Each word only appears once in the common list, although it may have been found in more than one of the signature lists. Each word gets a score representing how many times it appeared on a signature list.
  • the top fifty (or other predetermined number) matches are selected based on this score. This is done by adding all words with a score of eight to the list of fifty, then all the ones with a score of seven and so on until fifty words are on the top fifty list. A consideration is made that if the match score is less than three, an additional criterion (e.g., whether the length of the replacement word is within two of the length of the misspelled word) is used for the selection.
  • the next filter is to find words in the top fifty list that match first or last parts of the misspelled word.
  • the length to match starts at the length of the misspelled word minus one, and is successively decreased.
  • the words on the top fifty list that match for the length are added to the top twenty list, until it is filled. This provides a list of twenty (or possibly another size) replacements that has the most likely replacements at the top.
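The signature dictionary and the two-stage ranking can be illustrated with the Python sketch below. It is not the EADP implementation. The primary signature (first three letters) and the first-half signature come from the text; the last-half and consonant-skeleton signatures are assumptions added so that the example has several signatures to score against, and the tie-breaking by alphabetical order is also an assumption.

```python
from collections import defaultdict


def primary_signature(word):
    return word[:3]  # the whole word when it is shorter than three letters


def signatures(word):
    half = max(1, len(word) // 2)
    return {
        primary_signature(word),
        word[:half],   # first half of the word (named in the text)
        word[-half:],  # assumed: last half of the word
        "".join(c for c in word if c not in "aeiou") or word,  # assumed: consonants only
    }


def build_dictionary(words):
    """Hashtable keyed by signature; each entry is a list of words."""
    table = defaultdict(list)
    for word in words:
        for sig in signatures(word):
            table[sig].append(word)
    return table


def is_correct(table, word):
    return word in table.get(primary_signature(word), [])


def suggest(table, word, shortlist=50, limit=20):
    # Score each candidate by the number of signature lists it appears on.
    scores = defaultdict(int)
    for sig in signatures(word):
        for candidate in table.get(sig, []):
            scores[candidate] += 1
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    # Candidates scoring below three must also be within two letters in length.
    ranked = [w for w in ranked
              if scores[w] >= 3 or abs(len(w) - len(word)) <= 2][:shortlist]
    # Second filter: match first or last parts, longest match length first.
    picked = []
    for length in range(len(word) - 1, 0, -1):
        for candidate in ranked:
            if candidate in picked:
                continue
            if candidate[:length] == word[:length] or candidate[-length:] == word[-length:]:
                picked.append(candidate)
                if len(picked) == limit:
                    return picked
    return picked


table = build_dictionary(["receive", "recipe", "recent", "believe", "relieve", "deceive"])
print(is_correct(table, "recieve"), suggest(table, "recieve"))
```

With richer signature sets the scores spread out more, which is why the text allows the signatures to be simplified or enriched depending on the server holding the dictionary.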
  • the EADPRichTextNode class includes a toSpellHtml method, which invokes the dictionary function for each word in its text attribute. If the node is an image tag or table anchor node, the toSpellHtml method returns the standard html for that node.
  • the table nodes also have toSpellHtml methods that just invoke toHtml.
  • the EADPRichTextList toSpellHtml method invokes the same method on each of its rich text nodes, which in turn cascade the method through the rich text structure.
  • the resulting html string has the misspelled words and their replacements isolated by special separator tags.
  • the font tags for the rich text node are repeated for each segment of text outside of the misspelled word.
  • When the spell check button (e.g., FIG. 11B ) is pressed on the rich text edit panel, it submits a request to the server to convert the rich text to “spell html” format, and bring up the html for the spell check panel 220 of FIG. 14 .
  • the panel 220 is assigned the spell check version of the html as a hidden input field.
  • the panel 220 has an area to display the rich text 221 , a text area 222 to display the current misspelled word or its correction, an option list of possible corrections 223 , and two buttons.
  • the “Correct It” button 224 replaces the current misspelled word with whatever is in the text area (this could be the original spelling, a choice from the option list, or a manually typed in replacement) and moves on to the next word.
  • the “Done” button 225 terminates spell check and moves back to the rich text edit panel.
  • FIG. 18 shows the steps of providing and using a spell check function for a rich text document that starts at step 460 .
  • a spell check option is presented for a user to select a spell check function to locate a replacement word for a document with rich text.
  • the dictionary is initialized so that each word in the dictionary has at least one signature to facilitate searching and retrieval of possible alternate substitutions for misspelled words.
  • creation of at least one signature for each word is accomplished by extracting one or more letters from the dictionary word and combining them to form the signature. This extraction and combination is performed according to the previously described alternatives.
  • a word of a document is determined not to be in the dictionary (i.e., void entry), then at step 480 , at least one signature associated with the misspelled word is created so that at step 485 , the dictionary can be searched using the signatures created in step 480 , and are associated with the misspelled word, as keys to locate possible replacement or substitution word(s) in the dictionary.
  • one or more lists of possible word substitutions in reply to a prior request of the user are presented.
  • substitution of a word in the rich text document is performed while honoring the attributes of the original word that is replaced. This substitution is performed using classes and methods associated with the spell checker that makes use of, and is in harmony with, the rich text memory structure representation described previously. The process completes at step 496 .
  • JAVASCRIPT functions that are unique to the present invention. These functions allow the spell check html to be presented and manipulated. Within the spell html, each misspelled word and its substitution list is isolated from the rest of the html by a separator string. That is, the spell html is split at these separators resulting in an array of strings where some of the entries are regular html and others are the misspelled words with the possible replacements separated by a different separator string. The next JAVASCRIPT function now glues this array back into html to present in the rich text area, with the regular html added.
  • the array entries for the misspelled words are added by creating a font tag with a gray background in its style (to highlight the misspelled word) and Courier font, for example.
  • the misspelled word is added, and an end font tag.
  • the first misspelled word is assigned to the text area for the replacement, and its replacement list is parsed out and assigned to the option list.
  • the “Correct It” button is pressed, the replacement string for the misspelled word is merged into the regular html, and the entire process is repeated (the “next” misspelled word is now the first, so the effect is to work down through the misspelled words).
  • the “Done” button is pressed, all remaining misspelled words are merged back into the surrounding html and the corrected html string is submitted back to the server, which then assigns it to the rich text edit panel.
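The splitting, highlighting, and merging of the spell html can be sketched as follows. The source describes these as client-side JAVASCRIPT functions; this Python version only illustrates the logic. Both separator strings, the function names, and the exact font tag are assumptions.

```python
OUTER = "\x1e"  # assumed separator around each misspelled entry
INNER = "\x1f"  # assumed separator between the word and its replacements


def parse(spell_html):
    """Split the spell html; odd positions become (word, replacements) tuples."""
    segments = []
    for i, piece in enumerate(spell_html.split(OUTER)):
        if i % 2 == 0:
            segments.append(piece)  # regular html
        else:
            word, *replacements = piece.split(INNER)
            segments.append((word, replacements))
    return segments


def render(segments):
    """Glue the array back into html, highlighting each misspelled word."""
    out = []
    for seg in segments:
        if isinstance(seg, tuple):
            out.append('<font face="Courier" style="background-color: gray">%s</font>' % seg[0])
        else:
            out.append(seg)
    return "".join(out)


def correct_first(segments, replacement):
    """'Correct It': merge the replacement for the first misspelled word."""
    for i, seg in enumerate(segments):
        if isinstance(seg, tuple):
            return segments[:i] + [replacement] + segments[i + 1:]
    return segments


def done(segments):
    """'Done': merge all remaining misspelled words back unchanged."""
    return "".join(seg[0] if isinstance(seg, tuple) else seg for seg in segments)


spell_html = ("The " + OUTER + INNER.join(["borwn", "brown", "born"]) + OUTER
              + " fox " + OUTER + INNER.join(["jmups", "jumps"]) + OUTER + ".")
segments = correct_first(parse(spell_html), "brown")
print(done(segments))
```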
  • FIGS. 15A-17 may represent a high level block diagram implementing the steps of the present invention.
  • the steps of using aspects of the present invention start at step 300 and continue with representing rich text in a document in a memory structure representation as shown at step 305 .
  • one or more classes are provided for use by Web based applications and browsers to create the memory structure.
  • the rich text class and rich text list class are instantiated, as necessary, by any associated program.
  • editing the rich text in a document using the rich text classes is performed.
  • well-formed segments of text e.g., xml or html
  • This well-formed text is then parsed at step 330 and any unparsed text is assigned to the current node's attribute at step 335 .
  • resolution of the current rich text node's text attribute is performed by extracting tag information and setting attributes in the rich text node.
  • some substitution strings are converted back to original values.
  • certain tags are suppressed (e.g., not relevant tags) by changing the starting and ending tags to substitution strings.
  • segments are reconstituted into one string and table related tags are restored at step 365 .
  • New rich text nodes are organized at step 370 by breaking segments at table tags and entries of a vector or a string are added as appropriate to the segments.
  • FIG. 16 shows steps of creating a rich text memory structure from text (e.g., resolveHtml method) starting at 375 .
  • text is read until a tag (e.g., a first tag) is detected. If the text is a non-null string, the current rich text node is cloned to make a preceding rich text node and assign all text before the tag (i.e., the non-null string) (step 385 ).
  • a determination is made as to whether a string is null. If no text or tags are found, then the string is null and the process terminates at step 392 .
  • a determination is made as to whether the tag is a link or image tag.
  • the current node is cloned to make a following node and text after the tag is assigned to the following node (step 400 ).
  • the processing will then continue with step 415 .
  • a check is made whether the first tag has a matching end tag at step 405 . If there is no matching end tag, at step 410 , the current rich text node is cloned to make a following node and any text after the end tag is assigned to the clone. Then, the text after the end tag is removed.
  • the information between the first tag and matching end tag is resolved (e.g., resolveTag method) and any text after the tag is removed.
  • the information between the first tag and the matching end tag is resolved to set up attributes in the current node.
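A heavily simplified sketch of this kind of recursive resolution is shown below in Python. It is not the resolveHtml method itself: the `Node` class is hypothetical, only `<b>`, `<i>` and `<img>` tags are recognized, nested tags of the same name are not handled, and a tag with no matching end tag is assumed to apply to the remainder of the string.

```python
import re
from dataclasses import dataclass, replace


@dataclass
class Node:
    """Hypothetical rich text node with two attributes and an image flag."""
    text: str = ""
    bold: bool = False
    italic: bool = False
    image: bool = False


TAG = re.compile(r"<(b|i|img)\b([^>]*)>", re.IGNORECASE)


def resolve_html(html, current=None):
    """Return a flat list of nodes for an html fragment."""
    if current is None:
        current = Node()
    match = TAG.search(html)
    if match is None:  # no tag left: the rest is plain text
        return [replace(current, text=html)] if html else []
    nodes = []
    before, name = html[:match.start()], match.group(1).lower()
    if before:  # clone the current node for the text preceding the tag
        nodes.append(replace(current, text=before))
    rest = html[match.end():]
    if name == "img":  # image tags have no end tag; keep the whole tag as text
        nodes.append(replace(current, text=match.group(0), image=True))
        return nodes + resolve_html(rest, current)
    end_tag = "</%s>" % name
    end = rest.lower().find(end_tag)
    if end == -1:  # assumption: an unclosed tag covers the remainder
        inner, after = rest, ""
    else:
        inner, after = rest[:end], rest[end + len(end_tag):]
    styled = replace(current,
                     bold=current.bold or name == "b",
                     italic=current.italic or name == "i")
    # Text inside the tag gets the new attribute; text after it keeps the old ones.
    return nodes + resolve_html(inner, styled) + resolve_html(after, current)


for node in resolve_html("Hi <b>bold <i>both</i></b> end"):
    print(repr(node.text), node.bold, node.italic)
```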
  • FIG. 17 shows the steps of using the present invention with interactions through a browser application or the like starting at step 425 .
  • a response to a request is made for editing a document containing rich text. Rich text editing controls are presented for editing the document at step 435 , as a response to the request.
  • changes are accepted to the document using the rich text class and rich text list class for editing. If a request for spell checking is made, the request is recognized and a response generated, at step 445 .
  • a spell check panel is presented that displays spelling alternatives to a misspelled word. Upon selection of a substitution, a spelling substitution is accepted and entered into the rich text document using the rich text classes provided by this invention.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Document Processing Apparatus (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A method and apparatus for representing and controlling documents including rich text for Web based applications and browsers is provided so that editing of rich text can be facilitated within the browsers. The rich text is represented in a memory structure so that various formats may be flexibly maintained. Text, images, tables, links and the like are represented in the memory structure, which may be maintained in databases for eventual editing. A controller class and subsidiary classes represent the rich text and provide methods to convert html to the memory structure and back, representing the rich text in a relational database, retrieving the rich text from a relational database, and presenting the rich text for editing. A spell checking facility for the rich text is included.

Description

CROSS REFERENCE TO RELATED APPLICATION
The present application is a continuation application of co-pending application Ser. No. 12/940,462, filed on Nov. 5, 2010, its contents being incorporated by reference in its entirety herein.
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention generally relates to rich text capability for Web based applications and Web browsers, and more specifically, to a system and method for representing and controlling rich text in memory and various text representations.
2. Background Description
Web browser based applications are becoming increasingly popular. These browser based applications necessarily handle documents of various types. However, document handling and management of documents as they change over time to include new or varying content can be very expensive and cumbersome. Flexibility in representing and handling documents, including those stored in relational databases, is limited. One specific example of a major drawback is the lack of a robust rich text capability.
Standard Web browsers do not provide full feature rich text edit functions. This includes, for example, the general lack of ability to change font face, size and color, underline, bold, italic, to create tables and lists (both ordered and unordered), to check spelling, and to add in-line images or file attachments. Further, images and file attachments typically cannot be added as links to other Uniform Resource Locators (URL), or uploaded from a local file system into Binary Large Object (BLOB) data stored on a server.
Some known web browsers have features that allow direct editing of hypertext mark-up language (html) features of a page (i.e., the “content editable” feature) which effectively creates a text area that allows limited rich text editing. These browsers, however, do not provide any method to save changes to rich text that have been made through its editing facilities. Most browsers, however, do not provide even rudimentary text or other type of editing features.
The present invention overcomes the problems set forth.
SUMMARY OF THE INVENTION
In an aspect of the present invention, a method is provided for managing rich text for applications such as Web based applications and browsers. The method comprises representing the rich text in a memory structure representation and providing one or more classes for use by the applications and browsers to create the memory structure representation representative of rich text. The classes include a rich text list class for managing one or more rich text nodes and a rich text class to create rich text nodes that represent a unit of rich text and its attributes. When editing rich text in a document, the memory structure representation is used that was created by the provided classes.
In another aspect, a method is provided to represent and manage rich text for use by applications and browsers that involves representing the rich text in a memory structure representation and providing classes for use by the application and browsers to create the memory structure representation. A spell checker is additionally provided to facilitate correcting misspelled words. The spell checker utilizes the memory structure representation and the provided rich text classes. The spell checker employs a dictionary wherein each word of the dictionary has a signature associated with the word to facilitate searching for substitute words.
In another aspect, an apparatus of the invention provides components for representing and managing rich text for use by the applications and browsers. The apparatus includes a component for representing rich text in a memory structure representation and a component for providing one or more classes for use by the applications and browsers to create the memory structure representation. A component for editing rich text in a document using the rich text classes is provided, as is a spell-checking component.
In another aspect of the invention, a computer program product comprising a computer usable medium having a computer readable program code embodied in the medium is provided. The computer program product includes a first computer program code to provide one or more classes for use by applications to at least create and manage one or more rich text nodes in a memory structure representation representative of rich text. Additionally, a second computer program code to represent the rich text in the memory structure representation, and a third computer program code to edit rich text in a document using the memory structure representation to perform editing functions on a document having rich text as managed and created by the one or more classes are provided.
BRIEF DESCRIPTION OF THE DRAWINGS
The foregoing and other objects, aspects and advantages will be better understood from the following detailed description of a preferred embodiment of the invention with reference to the drawings, in which:
FIG. 1 is a block diagram showing an illustrative context of the present invention;
FIG. 2A is a relational block diagram illustrating various aspects according to the present invention;
FIG. 2B is a relational block diagram for a rich text list and rich text nodes according to the present invention;
FIG. 2C is a description of possible contents of a rich text node according to the present invention;
FIG. 3 is a relational block diagram of table node and subclass nodes according to the present invention;
FIG. 4 is a relational block diagram of rich text nodes according to the present invention;
FIG. 5 is a functional block diagram showing steps and components involved in creating various types of rich text nodes according to the present invention;
FIG. 6 is a functional block diagram showing steps to process a rich text list;
FIG. 7 is a functional block diagram showing steps and results of processing a table node according to the present invention;
FIG. 8 is a functional block diagram showing the results of processing a rich text list according to the present invention;
FIG. 9 is a block diagram showing components involved in processing a databody with rich text using an aggregate editor according to the present invention;
FIG. 10 is a relational block diagram showing the relationship of components in editing a databody, images or attachments by an aggregate editor and a rich text editor with a browser according to the present invention;
FIG. 11A is an illustration of a browser screen in browse mode with rich text according to the present invention;
FIG. 11B is an example of an edit screen and controls according to FIG. 11A;
FIG. 11C is another example of an edit screen and tool bar controls for editing rich text according to the present invention;
FIG. 11D is an example of a browser screen for editing rich text with a browser according to the present invention;
FIG. 12A is an example of editing rich text tables and lists according to the present invention;
FIG. 12B is another example of editing rich text tables and lists according to the present invention;
FIG. 13A is an example of editing rich text to select or browse a URL according to the present invention;
FIG. 13B is an example of editing rich text for images, attachments, or links according to the present invention;
FIG. 14 shows a spelling check screen for determining replacement words in a rich text document;
FIGS. 15A and 15B are flow diagrams showing steps of using the present invention to represent rich text in a memory structure;
FIG. 16 is a flow diagram showing steps of processing text to represent rich text in memory structure;
FIG. 17 is a flow diagram showing steps of using the present invention from a Web type application; and
FIG. 18 is a flow diagram showing the steps of providing a spell check function for a rich text document according to the present invention.
DETAILED DESCRIPTION OF A DETAILED EMBODIMENT OF THE INVENTION
This invention provides a full feature rich text edit capability for a standard Web browser and other applications. In particular, the present invention provides a method and system to consistently represent rich text in a memory structure in order to facilitate editing and managing documents containing such rich text. These memory structures may be resident on a computer, server or other known hardware. The documents may include, for example, html documents presented via a web browser or other web based applications. These documents may contain text, tables, images, links and the like in which the system and method of the present invention represents such elements as rich text in such documents. By utilizing the system and method of the present invention, it is now possible to edit and save such documents in many types of environments thus providing flexible and robust management and control capabilities. The present invention is described with illustration to the Enterprise Application Development Platform (EADP) developed by International Business Machines Corporation. This environment is shown for illustrative purposes and it should be understood by those of ordinary skill in the art that any other suitable context may be alternatively employed and implemented by the present invention.
System and Structure of the Present Invention
Now referring to FIG. 1, an exemplary environment of the invention is shown. In this exemplary environment, a client computer 1 is provided with a browser having an applet for accessing Web applications typically over a network such as the Internet 2. A server 3 with servlet is connected to the Internet 2 and a database 4. The server 3 and associated database 4 provide for a Web based application in communication with the client computer 1. In an embodiment, the browser can be optimized for providing capabilities for any known browser or application. This is achieved by controlling rich text from its memory representation. All other representations such as in a database, html from a Web browser, or any other new potential source such as Rich Text Format (RTF), may be mapped to the controlled memory format. The memory format may then be used to create new representations of the rich text for various purposes such as, for example, editing, or to show misspelled words by highlighting, html, plain text, and the like.
By way of illustration, in memory, each rich text field is represented by a controller class (e.g., the rich text class), and subsidiary classes that hold the rich text content. The most basic of these is the rich text node, which represents a single atomic unit of the rich text (i.e., text with its attributes such as font face, font size, underlining, italics, etc.). The rich text node may also have attributes to determine, for example, if the text is bold, underlined, italic, or another attribute may determine if that text node should start a new paragraph. Essentially any text attribute can be represented.
Memory Structure
FIG. 2A is a relational block diagram illustrating various aspects according to the present invention. In particular, FIG. 2A shows a memory structure 100 comprising a rich text list class for controlling the collection of rich text nodes (e.g., RichTextNode in EADPRichTextNode class) in various string representations, generally represented as 101, 102, 103, and 104. The string representations 101-104 may include, for example, a long string stored as a Character Large Object (CLOB) 101 in a database (such as a relational database DB2), html representation 102 to display on the Web, plain text 103 to use as the editable text of a rich text editor, and text 104 used for spell checking. As described below, the present invention also provides methods (e.g., JAVA methods, or the like) to access and convert rich text structures from and into various formats.
FIG. 2B is a relational block diagram for a rich text list and rich text nodes according to the present invention. In this illustration, one or more rich text nodes 105, which make up the rich text, are controlled by a rich text list class node 106 (e.g., EADPRichTextList). The rich text list class node 106 is a controller class, which contains a top-level list of one or more rich text nodes 105. These rich text nodes 105 can then be used to start table nodes 107 that eventually point down to other rich text nodes 105 in table cells 108 that include heading and row cells. This nested structure of text nodes and tables may be representative of the general memory structure of the rich text. At its simplest, this rich text list class 106 maintains a list of rich text nodes 105 (e.g., RichTextNode). However, representing tables and lists may include nested structures of rich text nodes 105, table nodes 107, and table cells 108.
FIG. 2C is a description of possible contents of a rich text node, i.e., RichTextNode class and its memory structure. This RichTextNode class is used in conjunction with applications such as Web browsers and the class is instantiated as necessary when used by the applications. At its simplest, rich text contains text (string data) with attributes to control its presentation. These may include, for example, the font face, font size, font color, and whether or not the text is italicized, underlined, or bold. Segments of text where these attributes are the same are represented as a single rich text node (e.g., the JAVA class EADPRichTextNode). The RichTextNode class of a rich text node 105 may include a few additional properties, such as whether it is at a line break, or whether it starts a table. The text property is used to store the text string for a rich text node. The rich text node can also represent the location of an image or link. In this case the contents of an html image tag (or xml) are stored in the text property of the rich text node, which then holds all the information needed to create the html for that image or link.
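By way of illustration only, the node described above can be sketched in JAVA as a run of text plus its presentation attributes. The class name RichTextNodeSketch, its field names, and the coalesce helper are assumptions made for this sketch and are not the actual EADPRichTextNode implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a rich text node: one run of text whose
// presentation attributes are uniform.
class RichTextNodeSketch {
    String text;
    String fontFace = "verdana";
    int fontSize = 3;
    String fontColor = "black";
    boolean bold, italic, underline;
    boolean lineBreak;   // true if the node is at a line break
    boolean startsTable; // true if the node anchors a table

    RichTextNodeSketch(String text) { this.text = text; }

    // Two runs can share one node only if every text attribute matches.
    boolean sameAttributes(RichTextNodeSketch o) {
        return fontFace.equals(o.fontFace) && fontSize == o.fontSize
            && fontColor.equals(o.fontColor) && bold == o.bold
            && italic == o.italic && underline == o.underline;
    }

    // Merges neighbouring nodes whose attributes are identical, so that each
    // segment of uniformly formatted text ends up in a single node.
    static List<RichTextNodeSketch> coalesce(List<RichTextNodeSketch> nodes) {
        List<RichTextNodeSketch> out = new ArrayList<>();
        for (RichTextNodeSketch n : nodes) {
            RichTextNodeSketch last = out.isEmpty() ? null : out.get(out.size() - 1);
            boolean mergeable = last != null && last.sameAttributes(n)
                && !n.lineBreak && !n.startsTable && !last.startsTable;
            if (mergeable) {
                last.text += n.text;
            } else {
                out.add(n);
            }
        }
        return out;
    }
}
```

In this sketch a node at a line break or anchoring a table is never merged, which is one plausible reading of the additional properties mentioned above.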
FIG. 3 is a relational block diagram of table nodes and sub class nodes. Specifically, FIG. 3 shows a table structure generally shown as 120. The format of the table structure 120 may be represented in memory as a set of special rich text node types including table node 121, table body node 122 and table header node 123 (for defining table characteristics), table row node 124, heading cell node 125 and row cell node 126 corresponding to the various types of html tags controlling table representation. In embodiments, each type of node maintains a reference to the nodes it controls for the next level. For example, the table row node 124 controls a list of row cell nodes 126, and the table body node 122 controls a list of table row nodes 124. The header cell node 125 and row cell node 126 maintain lists of rich text nodes 105 a, representing the content of those cells. The rich text node 105 a, in turn, may contain an anchor point to another table node 121 to start a new table at that point in the rich text. This structure allows for nested tables.
Most manipulation of the rich text is performed in its memory format as described above. The present invention also provides methods to transform the text from its memory format into the string representations and vice versa. In embodiments, the rich text is stored as a string in the relational database, and may be stored in a CLOB column due to a potentially large string size. Of course, there are alternative ways that this string can be formatted such as converting the rich text into the html string for storage. Another is to convert into xml. This approach may have some advantages if other applications are able to process the xml directly as it is stored in the relational database. A third alternative, which has the advantage of requiring less storage space, is to use a compressed format where the various attributes of each rich text node are captured, along with the text value for that node. For all three alternatives, the method to convert the rich text to string is similar to the method for generating an html string, except for formatting of each part of the string.
Creating Rich Text Memory Structure from Html
In embodiments, there are two aspects of creating rich text memory structures from html. In a first aspect, the rich text node has the ability to parse a well-formed segment of html and set its attributes accordingly. This includes the ability to create other rich text nodes as needed as the html indicates a change in text attributes or the presence of an image or link. In a second aspect, a function in the rich text list takes html that may not be well formed (i.e., non-well formed html), and preprocesses the html to make it recognizable by the rich text nodes. The rich text list also handles creating the nodes for the table structures included within the html.
The rich text node has the ability to parse a well-formed segment of html. A well formed segment of html may include, for example:
    • 1. Plain text outside tags;
    • 2. A tag that does not require an end tag is well formed.
    • 3. If a tag has a corresponding end tag, then the content between the start and end tags is well formed and does not contain a tag of the same type; and
    • 4. Tags that are not of interest to the rich text node are suppressed.
The tags that are of particular interest are table type tags, image and link tags, and the tags for the rich text attributes (e.g., font, italic, bold, underline, break and paragraph tags). A set of these tags can be used to define the attributes for one rich text node. For example a single rich text node may be represented as:
<p><i><strong><u><font face=“verdana” size=“3” color=“black”>Hello world</font></u></strong></i>
which looks like
Hello World
(type size is “3” and color is black)
However, suppose the passed html included a font change, located, for example, in the middle:
<p><i><strong><u><font face=“verdana” size=“3” color=“black”>Hello</font><font face=“verdana” size=“5” color=“red”>world</font></u></strong></i>
which now looks like this
Hello World
(type size of “Hello” is “3” and color is black, while the type size of “world” is now “5” and color is red)
In the latter scenario, two rich text nodes would be required to process these attributes. The parsing method for html handles this by creating a structure of rich text nodes using preceding and following node links as shown generally in FIG. 4. Depending on the actual html being parsed, this structure may be very elaborate and may include many children nodes. Three of these nodes 105 a, 105 b, and 105 c are arbitrarily chosen to further illustrate creation of memory structures from html in FIG. 4.
Referring now to FIG. 5, a block diagram showing steps and components involved in creating various types of rich text nodes is shown according to the present invention. It should be well understood that the block diagram of FIG. 5 (and FIGS. 6 and 7) may represent a structure of the present invention, as well as a high level flow diagram showing the steps implementing the present invention. The steps are denoted by each of the structural blocks or within the structural blocks, and may be implemented using a plurality of separate dedicated or programmable integrated or other electronic circuits or devices. A suitably programmed general purpose computer, e.g., a microprocessor, microcontroller or other processor device (CPU or MPU), either alone or in conjunction with one or more peripheral (e.g., integrated circuit) data and signal processing devices can be used to implement the invention. In general, any device or assembly of devices on which resides a finite state machine capable of implementing the flow charts shown in the figures can be used as a controller with the invention. The steps may equally be implemented on any known medium.
In FIG. 5, the current node 105 b reflects the current attributes of rich text node 105. The rich text list 106 passes, at step S1, well-formed segments of html to the rich text node 105. (The overall operation of the rich text list 106 will be described in more detail below). Also, the steps of the parsing method of rich text node 105 are shown in relation to the preceding and following nodes which are now produced. Once the html is resolved at step S2, the rich text node 105 performs some cleanup, as needed, on the passed html it has been asked to parse as shown at step S3. At step S4, the unparsed html is assigned to the text attribute of the rich text node. The parsing method of rich text node 105 then calls the resolveText method at step S5 to parse the html. The resolveText method of step S5 extracts tag information from the text attribute, then uses that tag information to set the other attributes in the rich text node by calling the resolveTag method 130, shown as step S6, and then sets the text to the text it parsed without the tag it just extracted. The steps of the resolveTag method 130 include the following:
1. Read the text up to the first tag (i.e., the first occurrence of “<”). If this is not a null string, clone the current rich text node 105 b and make the clone a preceding node 105 a (S7), and assign to it all the text before the first tag (i.e., first part). Then remove that part of the text and call the resolveTag method 130 again. The html needs to be well formed for the cloning steps to work recursively. The well formed property ensures that the encountered tags are in the proper order so that the text sent to the clone will not miss any tags.
2. If the tag has a matching end tag, check if there is any text beyond that end tag. If there is, clone the current rich text node 105 b, make that clone the following node 105 c (S8), and assign it the text after the end tag. Then remove that part of the text and call the resolveTag method 130 again.
3. If the tag is an image or link tag, clone the current rich text node 105 b and make that clone the following node 105 c (S8), and assign it the text after the tag (i.e., last part).
4. Pass the tag information (the text between the “<” and “>”) to resolve the tag and to set up the tag attributes, shown at step S9. If this is an image or link tag, it requires that the attributes are stored in the text. This is the reason for moving the original text to the following node.
5. If the preceding or following nodes are not null, call resolve tag 130 on them, making the preceding or following node (as appropriate) the current node, which recursively propagates more rich text nodes as necessary to fully represent the rich text.
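The recursive splitting in the steps above can be sketched as follows. This is a simplified illustration under stated assumptions, not the resolveTag method 130 itself: the names HtmlRunParser, Run, parse, and applyTag are hypothetical, tags are assumed to be lower case and well formed, image and link tags are not handled, and tags without an end tag (such as <p> or <br>) are simply skipped rather than recorded as attributes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Simplified sketch of splitting well-formed html into attribute-uniform runs.
class HtmlRunParser {
    // Hypothetical node: text plus the attributes in force for that text.
    static class Run {
        String text = "";
        boolean italic, bold, underline;
        String face, size, color;
        Run copy() { // plays the role of cloning the current node
            Run r = new Run();
            r.italic = italic; r.bold = bold; r.underline = underline;
            r.face = face; r.size = size; r.color = color;
            return r;
        }
    }

    private static final Pattern ATTR = Pattern.compile("(face|size|color)=\"([^\"]*)\"");

    // 'current' carries the attributes inherited from enclosing tags.
    static void parse(String html, Run current, List<Run> out) {
        if (html.isEmpty()) return;
        int lt = html.indexOf('<');
        int gt = html.indexOf('>');
        if (lt < 0 || gt < lt) {            // no usable tag: plain text only
            Run r = current.copy(); r.text = html; out.add(r);
            return;
        }
        if (lt > 0) {                       // text before the first tag: preceding node
            Run before = current.copy(); before.text = html.substring(0, lt); out.add(before);
            parse(html.substring(lt), current, out);
            return;
        }
        String tag = html.substring(1, gt);
        String name = tag.split("\\s+")[0];
        String rest = html.substring(gt + 1);
        String close = "</" + name + ">";
        int end = rest.indexOf(close);
        if (end < 0) {                      // no end tag in this segment: skip the tag
            parse(rest, current, out);
            return;
        }
        Run inner = current.copy();
        applyTag(name, tag, inner);
        parse(rest.substring(0, end), inner, out);                  // content inside the tag
        parse(rest.substring(end + close.length()), current, out);  // following node
    }

    static void applyTag(String name, String tag, Run r) {
        switch (name) {
            case "i": r.italic = true; break;
            case "strong": r.bold = true; break;
            case "u": r.underline = true; break;
            case "font":
                Matcher m = ATTR.matcher(tag);
                while (m.find()) {
                    if (m.group(1).equals("face")) r.face = m.group(2);
                    else if (m.group(1).equals("size")) r.size = m.group(2);
                    else r.color = m.group(2);
                }
                break;
            default: break;                 // tags not of interest are ignored
        }
    }
}
```

Applied to the two-font example given earlier (written with straight quotes), this sketch yields two runs, one for "Hello" and one for "world", each carrying the italic, bold, and underline flags of the enclosing tags.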
The resolveTag method 130 is relatively straightforward, except for the image tags. For other tag types, the resolveTag method 130 determines the type of the tag; for <i>, <strong>, <u>, <p>, or <br> it simply sets “on” the corresponding boolean attribute. For font tags, the content of the tag is parsed to determine if it has size, face or color information, and these attributes are set accordingly if they have been specified. Image tags are somewhat more complicated because the rich text editor overloads the file name with other information to set the alt tag, the height, the width, whether the image should float and whether the tag is to be treated as an in-line image, file attachment, or link. If the image size is manipulated within a rich text editor, the browser generates back the resized image with the height and width in a style statement instead of as html tag attributes. A style tag is generated with the float definition. All of this is written to the text attribute of the rich text node (each image tag requires its own rich text node). If the image is defined as a link instead of an image, the full link tag (e.g., <a href= . . . > . . . </a>) is placed in the text field.
FIG. 6 shows a block diagram including different structures or steps for processing a rich text list. The rich text list 106 may perform some preprocessing of the html before it passes well formed segments of html to the rich text nodes 105. In step S10, the html is cleaned up by converting some substitution strings back to their original values and suppressing meaningless tags such as </p>. At step S11, the html is made well formed. If the html has previously passed through rich text processing (e.g., it was generated from a rich text list at one point and then modified by a rich text editor), it will have markers where the rich text nodes were broken out the last time through (these are separated by a <!% TT %→ comment tag). The incoming text is broken at these markers at step S12. While this process makes it more efficient to process html, during rich text editing for example, it is not strictly necessary. It is understood that a parser is capable of handling large chunks of raw html such as would be encountered during conversions from another source, or if rich text was pasted into the rich text editor.
Still referring to FIG. 6, within each segment of html, tags that are not of interest at this point are buffered at step S13 by changing the start and end brackets to substitution strings. This includes table and list related tags, which are ignored now and restored later. At step S13, a check is also made to ensure that the tags start and end in the proper order, and that each start tag has a matching end tag within the segment. This is performed by bubbling up end tags that do not have matches within that segment, and then eliminating pairs of start and end tags that have no intervening content. At step S14, the segments are reconstituted into one string, again using the rich text node separator.
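Two of the preprocessing operations just described, buffering tags that are not of interest and eliminating empty start/end pairs, can be sketched as below. The class name HtmlPreprocessSketch, the substitution strings [[LT]] and [[GT]], and the exact set of tags treated as "of interest" are assumptions for illustration; the source does not specify them. Bubbling up unmatched end tags is not shown.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HtmlPreprocessSketch {
    // Hypothetical substitution strings standing in for "<" and ">".
    static final String LT = "[[LT]]", GT = "[[GT]]";
    // Assumed set of tags the rich text node parser cares about at this stage.
    static final Set<String> OF_INTEREST =
        new HashSet<>(Arrays.asList("p", "br", "i", "strong", "u", "font", "img", "a"));

    private static final Pattern TAG =
        Pattern.compile("<(/?)([a-zA-Z][a-zA-Z0-9]*)([^>]*)>");
    // A start tag immediately followed by its own end tag (back reference \1).
    private static final Pattern EMPTY_PAIR =
        Pattern.compile("<([a-zA-Z][a-zA-Z0-9]*)(\\s[^>]*)?></\\1>");

    // Replaces the brackets of uninteresting tags (e.g. table and list tags).
    static String buffer(String html) {
        Matcher m = TAG.matcher(html);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            String repl = OF_INTEREST.contains(m.group(2).toLowerCase())
                ? m.group()
                : LT + m.group(1) + m.group(2) + m.group(3) + GT;
            m.appendReplacement(sb, Matcher.quoteReplacement(repl));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    // Restores the buffered tags once the rich text nodes have been resolved.
    static String restore(String html) {
        return html.replace(LT, "<").replace(GT, ">");
    }

    // Repeatedly removes start/end pairs that have no intervening content.
    static String removeEmptyPairs(String html) {
        String prev;
        do {
            prev = html;
            html = EMPTY_PAIR.matcher(html).replaceAll("");
        } while (!html.equals(prev));
        return html;
    }
}
```

The loop in removeEmptyPairs is needed because removing an inner empty pair can leave its enclosing pair empty in turn.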
At step S15, the table related tags, which were ignored previously, are restored. At step S16, the html is broken into segments at the <table> tags, and then organized into a new rich text list 132 that includes entries that are either simple strings 133 (for rich text node entries) or vectors 134 (for table entries). The list version of resolveFromHtml method 136 is called to process this list. For the string entries, the resolveFromHtml method 136 for the rich text node 106 is called. These nodes may be added directly to the list of rich text nodes attached to the main rich text list 135. For the vector entries, the resolveFromHtml method 140 for that table node 137 creates a new rich text node 138 in the next position in its main rich text list 135, passing the vector that has the table information.
FIG. 7 is a block diagram showing steps and results of processing a table node. The table structure is again generally shown as 120, and is built in memory by successively resolving the tags through each type of table node, i.e., table header node 173 or table body node 172. The operations of table node 171 are essentially repeated by any succeeding table node type created by table node 171, substantially a recursive operation. The table node 171 reads the incoming tag up to the first end tag (>) to strip out its own tag information at step S16, then splits the rest at the next tag type, and passes each entry to that type of table node, either table header node 173 or table body node 172. For each table node created, the appropriate resolveFromHtml method is iteratively called to continue processing. Table row nodes 174 and row cell nodes 176 are created from the table body node 172. Heading cell nodes 175 are created from the table header node 173. The cell type tag nodes (i.e., th and td nodes) receive html strings that contain source for rich text nodes. These are used to set up rich text lists attached to the cell nodes.
Converting the Rich Text Memory Structure into Html
FIG. 8 shows a block diagram showing the results of processing a rich text list. Once a memory structure has been created representative of html, regenerating html from these structures can be accomplished by utilizing a toHtml method associated with each node in the memory structure. The toHtml method 180 is used by each node in the memory structure to write out its part of the total html based on information in that node, i.e., it renders rich text as html for use by a browser or the like. The rich text list 106 calls this method on its main list of rich text nodes 105 and processes them in order. If any rich text node 105 has a table node 171, it calls the toHtml method for that table node (so that the html for that table is added to the resulting html string 182 before the next node in the main rich text list 106 is added). Each node (e.g., 171 and 172) in the table structure adds its own tag information to the resulting html and then calls the toHtml method 180 for each of its dependent tags. This process continues until all nodes have been processed.
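The cascading toHtml pattern described above may be sketched as follows. The names ToHtmlSketch, Part, TextNode, and Container are hypothetical, only a bold attribute is rendered, and no character escaping is attempted; the sketch is meant only to show each node writing its own part and delegating to its dependents.

```java
import java.util.ArrayList;
import java.util.List;

class ToHtmlSketch {
    // Every node in the memory structure writes its own part of the html.
    interface Part { void toHtml(StringBuilder out); }

    // Hypothetical rich text node; may anchor a table.
    static class TextNode implements Part {
        final String text;
        final boolean bold;
        Part table; // optional table anchored at this node

        TextNode(String text, boolean bold) { this.text = text; this.bold = bold; }

        public void toHtml(StringBuilder out) {
            out.append(bold ? "<strong>" + text + "</strong>" : text);
            // The table's html is added before the next node in the list.
            if (table != null) table.toHtml(out);
        }
    }

    // Hypothetical container: a rich text list (tag == null) or a table,
    // row, or cell node (tag == "table", "tr", "td", ...).
    static class Container implements Part {
        final String tag;
        final List<Part> children = new ArrayList<>();

        Container(String tag) { this.tag = tag; }

        Container add(Part p) { children.add(p); return this; }

        public void toHtml(StringBuilder out) {
            if (tag != null) out.append('<').append(tag).append('>');
            for (Part c : children) c.toHtml(out); // cascade to dependents in order
            if (tag != null) out.append("</").append(tag).append('>');
        }
    }
}
```

A list holding a text node that anchors a one-cell table, followed by a second text node, would render the first text, then the whole table, then the second text, which mirrors the ordering described for FIG. 8.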
Representing the Rich Text Structure in a Relational Database
Rich text is stored as a string in a relational database. Because of the potentially large size of this string, it may be stored in a CLOB column. In order to make this as compact as possible, and to reduce the amount of tag information stored as text (this is to make searching less confusing), most of the tag information in each rich text node may be stored in a compressed format. Arrays are kept of the permitted font face and color values, and the index for those entries is stored into the array. Also, other attributes such as bold, italic, underline and whether the rich text node is an image tag are boolean attributes, and what is stored from them is a null string for false and a one byte string for true. The table nodes are stored in their html tag format, except that the cell nodes may use the relational format for their rich text nodes.
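A compressed per-node encoding of the kind described above might look like the following sketch. The field order, the separator character, the permitted face and color lists, and the names CompactNodeCodec, Node, encode, and decode are all assumptions for illustration; the source states only that array indices are stored for face and color and that boolean attributes are stored as a null string or a one byte string.

```java
import java.util.Arrays;
import java.util.List;

class CompactNodeCodec {
    // Assumed arrays of permitted values; only the index is stored.
    static final List<String> FACES = Arrays.asList("arial", "courier", "verdana");
    static final List<String> COLORS = Arrays.asList("black", "red", "blue");
    static final String SEP = "\u0001"; // hypothetical field separator

    static class Node {
        String face = "arial", color = "black", text = "";
        boolean bold, italic, underline;
    }

    // Booleans are stored as an empty string (false) or a one byte string (true).
    private static String flag(boolean b) { return b ? "1" : ""; }

    static String encode(Node n) {
        int face = FACES.indexOf(n.face), color = COLORS.indexOf(n.color);
        if (face < 0 || color < 0) {
            throw new IllegalArgumentException("face or color not in permitted list");
        }
        return String.join(SEP, String.valueOf(face), String.valueOf(color),
            flag(n.bold), flag(n.italic), flag(n.underline), n.text);
    }

    static Node decode(String stored) {
        // Limit of 6 keeps empty flag fields and leaves the text field intact.
        String[] f = stored.split(SEP, 6);
        Node n = new Node();
        n.face = FACES.get(Integer.parseInt(f[0]));
        n.color = COLORS.get(Integer.parseInt(f[1]));
        n.bold = !f[2].isEmpty();
        n.italic = !f[3].isEmpty();
        n.underline = !f[4].isEmpty();
        n.text = f[5];
        return n;
    }
}
```

Because the text is the last field and the split is limited to six parts, the stored text stays searchable as plain characters with very little tag information around it.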
Databody fields can be stored in string, date, or numeric format and comprehensively represent the document contents. Rich text is an added type for the databody field that is stored in string format. An aggregate editor, which is capable of manipulating and editing a databody, recognizes the rich text type, and has a rich text list as one of its attributes to hold the memory representation of the rich text. This is converted into the string format for the relational database and assigned to the column that holds string values.
Retrieving the Rich Text Structure from a Relational Database
FIG. 9 is a block diagram showing components involved in processing a databody with rich text using an aggregate editor according to the present invention. Once the rich text structure is stored in a relational database according to aspects of the invention, it is retrievable for use such as editing and updating. If a databody field (e.g., 186) is defined as rich text, an aggregate editor 185 may retrieve the rich text string 186 from the column for string values 187 in the relational database 188 and convert that string into memory representation 189 using a toDb2 method 188 in its rich text list attribute. The toDb2 method 188 follows the same pattern as the toHtml method 180 described previously. A difference is that the string may be split into rich text nodes, so that the toDb2 method 188 for each rich text node 105 does a simple conversion of its portion of the string into corresponding attributes.
A particular consideration is the presentation of image tags that are BLOB references. These are modified to assure that the URL for the servlet is the current one. This is done in the memory representation of the rich text list. Each of its rich text nodes is checked to see if it is an image node representing a BLOB reference, and if so, the servlet portion of the URL is modified to match the current URL.
Presenting Rich Text for Editing Over the Web
FIG. 10 is a relational block diagram showing the relationship of components in editing a databody and the like with a browser according to the present invention. In FIG. 10, if the type of a databody field (e.g., 193 a) is rich text, then when a document is presented in edit mode using a rich text editor 190, that field is presented as read-only with a link above it so that when clicked by a user allows editing of the field. The link may be to a JAVASCRIPT method that brings up a rich text-editing window (i.e., a new browser window). This new window includes hidden html fields (i.e., hidden input fields) which contain the keys needed to process the field when edited (i.e., session key, manager key for the databody application class, row number of that databody field within the databody lists, etc.). This new window also passes the rich text converted into html using a resolveFromHtml method 191 for the rich text list attribute of the databody aggregate editor 185 rich text list 106. The rich text editor 190 may retrieve any images or attachments 191 from a database, shown in part, as a database row 193, using the servlet class doRichBlob 196 where the servlet is uploaded for parsing out of keys, byte array, etc.
In one type of the Web browser 198, the html for the rich text is assigned to a “content editable div” which allows the text to be edited directly. The rich text edit window is a somewhat simple html form. For other browsers 198 that do not provide native support for rich text edit, the rich text edit window is a frame. The frame includes two parts, as shown in FIG. 11D, one to edit the rich text as plain text, i.e., frame 210, using an applet 197, and a second frame, i.e., frame 211, to display the resulting rich text as it is edited. The same applet 197 may be used with known editors, but, in embodiments, may remain hidden. Applets are typically client-side JAVA programs that are loaded and run within the framework of a Web browser.
The applet 197 may be linked to the html edit window using the LiveConnect feature of JAVASCRIPT. In one browser version, each of the rich text editing functions 208 may call a JAVASCRIPT routine that invokes a function for rich text manipulation, and then passes the revised html to the applet 197. The applet 197 then processes the html, and writes the output back out to the “content editable div.” At its simplest, the applet 197 uses the html to create a rich text list structure in its memory, and then converts that rich text structure back into html. This cleans up the html and makes it well formed. In the case of image tags inserted into the rich text by the rich text editor 190, the applet 197 does a great deal more.
There are several functions in the EADP rich text classes to support the plain text editing of the rich text. One is a method on all the rich text nodes to render them into plain text. When a simple rich text node is rendered to plain text, its text is written to the output string, along with a one byte separator (a non-editable break character). The latter serves as a reminder that the plain text is really a representation of rich text, and also makes it easier to parse updates to the plain text representation to render it back into rich text. If the rich text node is an image node it reports itself in the plain text representation as an image or link. If it is the anchor point of a table node, it reports itself as a table. Note that the content of the table consists of titles and data cells, which are themselves rich text nodes, so it is possible to edit the table by editing its plain text representation.
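The plain text rendering with a one byte separator can be illustrated with the sketch below. The separator value, the "[image]" placeholder, and the names PlainTextSketch, Node, toPlainText, and split are assumptions; table anchors are omitted for brevity.

```java
import java.util.ArrayList;
import java.util.List;

class PlainTextSketch {
    // Hypothetical one byte, non-editable break character between nodes.
    static final char SEP = '\u001F';

    static class Node {
        final String text;
        final boolean image;
        Node(String text, boolean image) { this.text = text; this.image = image; }
    }

    // Renders each node's text followed by the separator; an image node
    // reports itself by a placeholder rather than by its tag contents.
    static String toPlainText(List<Node> nodes) {
        StringBuilder sb = new StringBuilder();
        for (Node n : nodes) {
            sb.append(n.image ? "[image]" : n.text).append(SEP);
        }
        return sb.toString();
    }

    // Splits (possibly edited) plain text back into per-node text segments.
    static List<String> split(String plain) {
        List<String> parts = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < plain.length(); i++) {
            if (plain.charAt(i) == SEP) {
                parts.add(plain.substring(start, i));
                start = i + 1;
            }
        }
        if (start < plain.length()) parts.add(plain.substring(start));
        return parts;
    }
}
```

Since the separators survive editing, the i-th segment returned by split can be written back to the i-th rich text node, keeping that node's attributes unchanged.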
FIGS. 11A-11D illustrate screen shot examples of rich text in browse and edit mode. FIGS. 11B-11D show screen shots in edit mode showing various edit selections 208, including in the body of the browser (FIG. 11B) and a tool bar (FIG. 11C). Another feature of the present invention is the ability to determine cursor position and selected text within the rich text node. The text area in the applet 197 is able to report the cursor position and the start and end of selected text in the plain text representation. This is then interpreted to determine which parts of text, and in which rich text nodes, have been selected. Since text selection is typically related to a change in font characteristics, the text node may need to be split to allow the change in face, size, or color. Each keystroke event in the plain text area is intercepted, and the plain text is written back into rich text in the area on the bottom of the frame. If tables, lists, or file attachments are chosen, an image tag is generated to mimic what happens in a certain editor, and it may be inserted at the current cursor position.
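As a minimal sketch of interpreting a reported selection, the following assumes that each rich text node contributes its text plus exactly one separator character to the plain text representation. The names SelectionSketch, locate, and splitForSelection are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class SelectionSketch {
    // Maps a plain text offset to {node index, offset within that node},
    // assuming each node occupies text.length() + 1 characters.
    static int[] locate(List<String> nodeTexts, int offset) {
        int pos = 0;
        for (int i = 0; i < nodeTexts.size(); i++) {
            int len = nodeTexts.get(i).length();
            if (offset <= pos + len) return new int[] { i, offset - pos };
            pos += len + 1; // skip the node's text and its separator
        }
        throw new IndexOutOfBoundsException("offset " + offset);
    }

    // Splits one node's text into up to three pieces (before, selected, after)
    // so that new font attributes can be applied to the selected piece only.
    static List<String> splitForSelection(String text, int start, int end) {
        List<String> parts = new ArrayList<>();
        if (start > 0) parts.add(text.substring(0, start));
        if (end > start) parts.add(text.substring(start, end));
        if (end < text.length()) parts.add(text.substring(end));
        return parts;
    }
}
```

Each returned piece would become its own rich text node cloned from the original, with only the middle piece receiving the changed face, size, or color.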
Handling Tables, Lists, Images and File Attachments During Rich Text Editing and Presentations
When editing rich text and presentations using a browser, the memory structures and mechanisms to manage the representations of the rich text are consistently maintained as described above in order to provide overall controls for the editing operation. Examples of browser presentations and rich text editing options, illustrating the relationship between user interaction via a browser and the memory structures, are expanded further in conjunction with FIGS. 11A through 14B.
Rich text editing functions of some browsers implementing the present invention provide three basic types of functions. The first is a variety of ways to change the font and text characteristics (this includes font face, font size, font color, bold, italic, and underlining). The second is the ability to insert an image at the current cursor position by specifying the local file name for that image. The third is the ability to indicate selected text through use of the insert link tag by specifying a special URL for the link that indicates the advanced function to perform. The advanced features of the rich text edit function are built on extensions of the image and link tag facilities. The native function of the browser may be used to create an image or link tag with a file name or URL that is overloaded with additional parameters. This is then intercepted by JAVASCRIPT functions or the hidden applet 197, and used to provide additional features.
One example of this is the way EADP-based rich text editing of the present invention allows insertion of table structures and lists into the rich text area. The button labeled “ListsAndTables” (FIG. 11B) (or the equivalent icons) invokes the image insertion function in the browser, but with a file name of “table”. When the hidden applet 197 intercepts the generated html, it first creates a rich text structure from the passed html, and then looks for an image tag with file name of “table.” If one exists, it brings up a frame (or panel) 212 a and 212 b that allows creation of tables and lists as shown in FIGS. 12A and 12B. The options available from these frames 212 a and 212 b depend on where in the rich text it is invoked. If it is invoked from an area of regular text the only options are to create a new table or list, as shown in FIG. 12A, frame 212 a. If it is invoked from within an existing table, there are options to add or modify columns, rows, and headers, as shown in FIG. 12B, frame 212 b. As can be seen, depending on which type of table element is chosen, the elements that can be specified change accordingly. When a selection and update is made in this frame, the applet 197 then uses the information to add or update a table node or list entry in its rich text structure in memory. This is then converted back into html and written back out to the rich text display area.
Referring now to FIGS. 13A and 13B, when the “Attachments” button 216 of FIG. 13B (or equivalent icon) is pressed, this invokes a JAVASCRIPT function that brings up a new html window (panel) 215 to process images, attachments, and links. This panel 215 allows selection of whether to process the file or URL as an image, attachment, or link as shown by 216. The source can be either a local file or an existing URL. For URLs, a new html window is opened (not shown) to allow selection of the URL when the Browse URL button 217 is pressed.
The file button 218 (FIG. 11A) on the browser tool bar invokes the standard input of type file provided by all Web browsers. This allows the file contents to be uploaded to the server. When this html window 215 is opened, the keys to the current text being edited are added as hidden input fields (e.g., the session key, the manager key, and the databody row number). If a local file is chosen, this information along with the file name is used to create a new entry for the file contents in the BLOB table in the relational database on the supporting server. This data is uploaded and stored immediately to avoid problems in a clustered server environment (i.e., it is typically too expensive in a clustered environment to attempt to try to store the BLOB contents in session memory). If a URL (e.g., Select URL button 219) is chosen as the source, there is no need to upload the data.
This panel 215 allows the addition of a great deal more formatting of data for the image or attachment. This includes aspects that are needed for well formed and accessible html such as the alt tag, the size of the image, and whether it should float. All this may be added to the file name that is assigned to the image tag. When the OK button is pressed, the file is uploaded if need be, and the image creation function on the parent panel is called. This adds the image tag with the overloaded file name to the html, and invokes the applet 197 to intercept and resolve the html. The applet 197 then creates the rich text structure in memory from the passed html. When it processes each image tag, it resolves the file name by parsing out any information that was added as an overload. This additional information is used to set additional parameters in the image tag, to change the image tag to represent a file attachment, or to indicate that the image tag should write itself out as a simple link, for example.
Providing Spell Checking
As a convenient feature during rich text editing, spell-checking operations are provided in the various embodiments of the present invention. The spell checking solution is optimized for use within a servlet environment. Servlets are typically server-side JAVA programs that are loaded and run within the framework of a web server. The dictionary functions all reside, preferably, on the server side, and reside as singletons in server memory so that they are extremely fast. The returned html includes all misspelled words and possible replacements so that JAVASCRIPT functions on the client side can provide an interactive and responsive spelling correction. The technique for dictionary creation and usage is also unique to this invention.
The spelling dictionary may be created initially from word lists then instantiated and serialized. The serialized hashtable is held as property files in the JAVA code for the EADP (or equivalent) dictionary class (e.g., EADPSpellCheckController). The structure of the dictionary is a hashtable, where the entries are lists of words. The keys to these entries are unique and provide powerful search ability. In embodiments, each word is assigned a set of characteristic signatures. These characteristics can be simplified or enriched depending on the capabilities of the server holding the dictionary. The possible sets of signatures are:
1. If the word length is less than three, the only signature is the word itself.
2. If the word length is greater than eight, one signature is the first half of the word.
3. If the word length is greater than seven, the first three and last three characters are signatures.
4. If the word length is between four and seven, the first two and last two characters are signatures.
5. If the word length is greater than four, the first four and the last four characters are signatures.
6. If the word length equals four the first two characters plus the last character is a signature.
7. If the word length equals four, the first letter plus the last two letters is a signature.
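The seven signature rules listed above can be sketched in JAVA as follows. The class and method names are hypothetical, "between four and seven" is assumed to be inclusive, and "first half" is assumed to mean the first length/2 characters. Note that, read literally, the rules produce no signature for a three-letter word, and the sketch reflects that.

```java
import java.util.LinkedHashSet;
import java.util.Set;

class SignatureSketch {
    // Returns the set of signatures for a word, following the listed rules.
    static Set<String> signatures(String w) {
        Set<String> s = new LinkedHashSet<>();
        int n = w.length();
        if (n < 3) {                       // rule 1: the word itself
            s.add(w);
            return s;
        }
        if (n > 8) {                       // rule 2: first half of the word
            s.add(w.substring(0, n / 2));
        }
        if (n > 7) {                       // rule 3: first three and last three
            s.add(w.substring(0, 3));
            s.add(w.substring(n - 3));
        }
        if (n >= 4 && n <= 7) {            // rule 4: first two and last two
            s.add(w.substring(0, 2));
            s.add(w.substring(n - 2));
        }
        if (n > 4) {                       // rule 5: first four and last four
            s.add(w.substring(0, 4));
            s.add(w.substring(n - 4));
        }
        if (n == 4) {
            s.add(w.substring(0, 2) + w.charAt(3)); // rule 6: first two + last
            s.add(w.charAt(0) + w.substring(2));    // rule 7: first + last two
        }
        return s;
    }
}
```

For instance, under these assumptions the word "word" yields the signatures "wo", "rd", "wod", and "wrd", and a dictionary would add "word" to the list keyed by each of them.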
The signatures can be enhanced on more powerful servers. It should be understood that each word may be added to the list keyed by each of its signatures. Also, each word has a primary signature, its first three or four letters (or the entire word if it is short). A word is checked for correctness initially by determining if it is a member of the word list for its primary signature. If a word is not correctly spelled, replacements are determined by using all its signatures to find the words in the list for those signatures.
When a word is checked for correctness, it is first checked to see if it is present in the list for its primary signature. If it is not there, then it is not spelled correctly. In this case, a substitution list is created for the word. That consists of creating a set of signatures for the misspelled word, finding all the words in the lists keyed by those signatures, and then selecting the twenty best matches (ranked as described next) to the word in question.
The ranking is accomplished by creating a common list of all the potential replacements. Each word only appears once in the common list, although it may have been found in more than one of the signature lists. Each word gets a score representing how many times it appeared on a signature list.
The top fifty (or other predetermined number) matches are selected based on this score. This is done by adding all words with a score of eight to the list of fifty, then all the ones with a score of seven and so on until fifty words are on the top fifty list. A consideration is made that if the match score is less than three, an additional criterion (e.g., whether the length of the replacement word is within two of the length of the misspelled word) is used for the selection.
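The scoring and top-fifty selection may be illustrated by the following Python sketch, offered as one possible reading rather than the actual implementation. The names score_candidates and top_matches are hypothetical, the dictionary is assumed to be a plain mapping from signature to words, and the alphabetical tie-break within a score is an assumption added only to make the output deterministic.

```python
from collections import Counter


def score_candidates(misspelled_signatures, table) -> Counter:
    """Count how many signature lists each candidate word appears on."""
    scores = Counter()
    for sig in misspelled_signatures:
        for word in table.get(sig, ()):
            scores[word] += 1
    return scores


def top_matches(misspelled: str, scores: Counter, limit: int = 50) -> list[str]:
    """Take candidates from the highest score downward until the limit is hit."""
    selected = []
    for score in sorted(set(scores.values()), reverse=True):
        # Assumption: ties within one score are taken in alphabetical order.
        for word in sorted(w for w, s in scores.items() if s == score):
            # Weak matches (score below three) must also be close in length.
            if score < 3 and abs(len(word) - len(misspelled)) > 2:
                continue
            selected.append(word)
            if len(selected) == limit:
                return selected
    return selected


# Hypothetical signature lists and the signatures of the misspelling "spelll".
table = {
    "sp": ["spell", "spill", "spectacular"],
    "ll": ["spell", "spill", "ball"],
    "spel": ["spell", "spelt"],
}
scores = score_candidates(["spel", "sp", "ll", "elll"], table)
candidates = top_matches("spelll", scores)
```

Here "spell" appears on three lists and "spill" on two, while "spectacular" appears once and is dropped because its length differs from the misspelled word by more than two.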
The next filter is to find words in the top fifty list that match first or last parts of the misspelled word. The length to match starts at the length of the misspelled word minus one, and is successively decreased. At each stage, the words on the top fifty list that match for the length are added to the top twenty list, until it is filled. This provides a list of twenty (or possibly another size) replacements that has the most likely replacements at the top.
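A minimal Python sketch of this second filter follows, under stated assumptions. The function name best_replacements is hypothetical, and the sketch assumes that candidates matching at the same length keep the order they had in the top fifty list, a detail the description does not settle.

```python
def best_replacements(misspelled: str, top_fifty: list[str], limit: int = 20) -> list[str]:
    """Order candidates by how long a leading or trailing part they share."""
    chosen: list[str] = []
    # Start one character short of the misspelled word and shrink the match.
    for length in range(len(misspelled) - 1, 0, -1):
        head = misspelled[:length]
        tail = misspelled[-length:]
        for word in top_fifty:
            if word in chosen:
                continue  # each replacement is listed only once
            if word.startswith(head) or word.endswith(tail):
                chosen.append(word)
                if len(chosen) == limit:
                    return chosen
    return chosen


suggestions = best_replacements("spelll", ["spell", "spill", "ball", "spelt"])
```

With the misspelling "spelll", "spell" matches the first five characters and is placed first, "spelt" matches the first four, and "spill" and "ball" only match at length two, so they fall to the end of the list.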
The EADPRichTextNode class includes a toSpellHtml method, which invokes the dictionary function for each word in its text attribute. If the node is an image tag or table anchor node, the toSpellHtml method returns the standard html for that node. The table nodes also have toSpellHtml methods that just invoke toHtml. The EADPRichTextList toSpellHtml method invokes the same method on each of its rich text nodes, which in turn cascade the method through the rich text structure. The resulting html string has the misspelled words and their replacements isolated by special separator tags. The font tags for the rich text node are repeated for each segment of text outside of the misspelled word.
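The shape of the resulting "spell html" for a single text node may be sketched as follows. This Python fragment is illustrative only: the separator strings WORD_SEP and ALT_SEP, the function to_spell_html, and the suggest callback are hypothetical stand-ins, since the actual separator tags and the EADPRichTextNode internals are not reproduced here. The callback is assumed to return None for a correctly spelled word and a list of replacements otherwise.

```python
import re

WORD_SEP = "@@SPELL@@"  # hypothetical separator around each misspelled entry
ALT_SEP = "@@ALT@@"     # hypothetical separator between word and replacements
WORD = r"[A-Za-z']+"


def to_spell_html(text: str, open_tag: str, close_tag: str, suggest) -> str:
    """Wrap correct text in the node's font tags and isolate misspelled words."""

    def wrap(plain: str) -> str:
        # The font tags are repeated for every segment outside a misspelled word.
        return open_tag + plain + close_tag if plain else ""

    segments = []  # alternates: regular html, misspelled entry, regular html, ...
    plain = ""
    for token in re.split(f"({WORD})", text):
        replacements = suggest(token) if re.fullmatch(WORD, token) else None
        if replacements is None:
            plain += token  # correct word, whitespace, or punctuation
            continue
        segments.append(wrap(plain))
        segments.append(ALT_SEP.join([token, *replacements]))
        plain = ""
    segments.append(wrap(plain))
    return WORD_SEP.join(segments)


def demo_suggest(word):
    # Stand-in for the server-side dictionary lookup.
    return {"teh": ["the", "ten"]}.get(word.lower())


spell_html = to_spell_html("fix teh bug", '<font color="red">', "</font>", demo_suggest)
```

Because a separator is placed on both sides of every misspelled entry, splitting the string at WORD_SEP always yields regular html at even positions and misspelled entries at odd positions.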
When the spell check button (e.g., FIG. 11B) is pressed on the rich text edit panel, it submits a request to the server to convert the rich text to “spell html” format, and bring up the html for the spell check panel 220 of FIG. 14. The panel 220 is assigned the spell check version of the html as a hidden input field. The panel 220 has an area to display the rich text 221, a text area 222 to display the current misspelled word or its correction, an option list of possible corrections 223, and two buttons. The “Correct It” button 224 replaces the current misspelled word with whatever is in the text area (this could be the original spelling, a choice from the option list, or a manually typed in replacement) and moves on to the next word. The “Done” button 225 terminates spell check and moves back to the rich text edit panel.
FIG. 18 shows the steps of providing and using a spell check function for a rich text document, starting at step 460. At step 465, a spell check option is presented for a user to select a spell check function to locate a replacement word for a document with rich text. At step 470, either at the selection time of the spell check option or at another time, the dictionary is initialized so that each word in the dictionary has at least one signature to facilitate searching and retrieval of possible alternate substitutions for misspelled words. At step 475, creation of at least one signature for each word is accomplished by extracting one or more letters from the dictionary word and combining them to form the signature. This extraction and combination is performed according to the previously described alternatives. If, at step 477, a word of a document is determined not to be in the dictionary (i.e., void entry), then at step 480, at least one signature associated with the misspelled word is created. At step 485, the dictionary is searched using the signatures created in step 480 for the misspelled word as keys to locate possible replacement or substitution word(s) in the dictionary. At step 490, one or more lists of possible word substitutions in reply to a prior request of the user are presented. At step 495, substitution of a word in the rich text document is performed while honoring the attributes of the original word that is replaced. This substitution is performed using classes and methods associated with the spell checker that make use of, and are in harmony with, the rich text memory structure representation described previously. The process completes at step 496.
These features are not typical, and are supported by JAVASCRIPT functions that are unique to the present invention. These functions allow the spell check html to be presented and manipulated. Within the spell html, each misspelled word and its substitution list is isolated from the rest of the html by a separator string. That is, the spell html is split at these separators, resulting in an array of strings where some of the entries are regular html and others are the misspelled words with the possible replacements separated by a different separator string. The next JAVASCRIPT function then glues this array back into html to present in the rich text area, with the regular html added as is. The array entries for the misspelled words are added by creating a font tag with a gray background in its style (to highlight the misspelled word) and Courier font, for example, followed by the misspelled word and an end font tag. The first misspelled word is assigned to the text area for the replacement, and its replacement list is parsed out and assigned to the option list. When the “Correct It” button is pressed, the replacement string for the misspelled word is merged into the regular html, and the entire process is repeated (the “next” misspelled word is now the first, so the effect is to work down through the misspelled words). When the “Done” button is pressed, all remaining misspelled words are merged back into the surrounding html and the corrected html string is submitted back to the server, which then assigns it to the rich text edit panel.
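The split, glue, and merge steps may be sketched as follows. The description places this logic in browser-side JAVASCRIPT functions; the sketch is written in Python purely to illustrate the data flow, and every name in it (WORD_SEP, ALT_SEP, split_spell_html, render, current_word, correct_it, done) as well as the exact highlight markup is a hypothetical choice. It assumes the spell html alternates regular html and misspelled entries, beginning and ending with regular html.

```python
WORD_SEP = "@@SPELL@@"  # hypothetical separator around each misspelled entry
ALT_SEP = "@@ALT@@"     # hypothetical separator between word and replacements
HIGHLIGHT_OPEN = '<font face="Courier" style="background-color: #cccccc">'
HIGHLIGHT_CLOSE = "</font>"


def split_spell_html(spell_html: str) -> list[str]:
    """Even entries are regular html, odd entries are misspelled words."""
    return spell_html.split(WORD_SEP)


def render(entries: list[str]) -> str:
    """Glue the array back into html, highlighting each misspelled word."""
    out = []
    for index, entry in enumerate(entries):
        if index % 2 == 1:
            word = entry.split(ALT_SEP)[0]
            out.append(HIGHLIGHT_OPEN + word + HIGHLIGHT_CLOSE)
        else:
            out.append(entry)
    return "".join(out)


def current_word(entries: list[str]):
    """Return (misspelled word, replacement options) for the first entry."""
    if len(entries) < 3:
        return None
    word, *options = entries[1].split(ALT_SEP)
    return word, options


def correct_it(entries: list[str], replacement: str) -> list[str]:
    """Merge the chosen text into the regular html; the next word moves up."""
    if len(entries) < 3:
        return entries
    return [entries[0] + replacement + entries[2], *entries[3:]]


def done(entries: list[str]) -> str:
    """Merge all remaining misspelled words back in unchanged."""
    while len(entries) >= 3:
        entries = correct_it(entries, entries[1].split(ALT_SEP)[0])
    return entries[0]


spell = f"a {WORD_SEP}teh{ALT_SEP}the{WORD_SEP} b {WORD_SEP}wrod{ALT_SEP}word{ALT_SEP}rod{WORD_SEP}."
entries = split_spell_html(spell)
after_first = correct_it(entries, "the")
final_html = done(after_first)
```

After the first correction, the entry for "wrod" occupies position one of the array, which is what lets the same functions be reapplied for each press of the "Correct It" button.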
Use of the Present Invention
The software classes described above include methods to instantiate the classes and to access the resulting objects. These software components may exist collectively or separately in libraries, in databases, on networks, on hard or floppy discs, tapes, or resident in various types of memories such as read-only, random access or removable memories. FIGS. 15A-17 may represent a high level block diagram implementing the steps of the present invention.
Referring to FIGS. 15A and 15B, the steps of using aspects of the present invention start at step 300 and continue with representing rich text in a document in a memory structure representation as shown at step 305. At step 310, one or more classes are provided for use by Web based applications and browsers to create the memory structure. At step 315, the rich text class and rich text list class are instantiated, as necessary, by any associated program. At step 320, editing the rich text in a document using the rich text classes is performed. At step 325, well-formed segments of text (e.g., xml or html) to a current rich text node are formed from a rich text list node. This well-formed text is then parsed at step 330 and any unparsed text is assigned to the current node's attribute at step 335. At step 340, resolution of the current rich text node's text attribute is performed by extracting tag information and setting attributes in the rich text node. At step 345, some substitution strings are converted back to original values. At step 355, certain tags are suppressed (e.g., not relevant tags) by changing the starting and ending tags to substitution strings. At step 360, segments are reconstituted into one string and table related tags are restored at step 365. New rich text nodes are organized at step 370 by breaking segments at table tags, and entries of a vector or a string are added as appropriate to the segments.
FIG. 16 shows steps of creating a rich text memory structure from text (e.g., resolveHtml method) starting at step 375. At step 380, text is read until a tag (e.g., a first tag) is detected. If the text is a non-null string, the current rich text node is cloned to make a preceding rich text node, and all text before the tag (i.e., the non-null string) is assigned to the preceding node (step 385). At step 390, a determination is made as to whether a string is null. If no text or tags are found, then the string is null and the process terminates at step 392. At step 395, a determination is made as to whether the tag is a link or image tag. If the tag is an image tag or a link tag, then the current node is cloned to make a following node and text after the tag is assigned to the following node (step 400). The processing then continues with step 415. However, if the tag is not an image tag or link tag, then a check is made whether the first tag has a matching end tag at step 405. If there is a matching end tag, at step 410, the current rich text node is cloned to make a following node and any text after the end tag is assigned to the clone. Then, the text after the end tag is removed. At step 415, the information between the first tag and matching end tag is resolved (e.g., resolveTag method) and any text after the tag is removed. At step 420, the information between the first tag and the matching end tag is resolved to set up attributes in the current node. At step 422, processing moves to any next non-null node, either a preceding or a following node; if both exist, they are processed in order. Processing continues at step 380.
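The clone-and-split idea behind FIG. 16 may be illustrated by the following heavily simplified Python sketch. It is not the resolveHtml method itself: the Node class and resolve_html function are hypothetical, only bold and italic tags are recognized, image, link, and table tags are omitted, and a nested tag of the same name is not matched correctly. A tag without an end tag is assumed to apply to the remainder of the text.

```python
import re
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Node:
    """Hypothetical stand-in for a rich text node with two attributes."""
    text: str = ""
    bold: bool = False
    italic: bool = False


START_TAG = re.compile(r"<(b|i)>", re.IGNORECASE)


def resolve_html(node: Node) -> list[Node]:
    """Split a node at its first tag into preceding, current, and following."""
    match = START_TAG.search(node.text)
    if match is None:
        return [node] if node.text else []  # null strings produce no node

    name = match.group(1).lower()
    before = node.text[: match.start()]
    rest = node.text[match.end():]

    end = re.search(re.escape(f"</{name}>"), rest, re.IGNORECASE)
    if end is None:
        inner, after = rest, ""  # assumption: unclosed tag covers the rest
    else:
        inner, after = rest[: end.start()], rest[end.end():]

    attribute = "bold" if name == "b" else "italic"
    preceding = replace(node, text=before)                  # clone, text before tag
    current = replace(node, text=inner, **{attribute: True})  # tag becomes attribute
    following = replace(node, text=after)                   # clone, text after end tag

    # Non-null nodes are processed in order: preceding, current, following.
    return resolve_html(preceding) + resolve_html(current) + resolve_html(following)


nodes = resolve_html(Node("plain <b>bold <i>both</i></b> tail"))
```

Each clone inherits the attributes already set on the node it was split from, which is how the inner italic segment in the example ends up both bold and italic.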
FIG. 17 shows the steps of using the present invention with interactions through a browser application or the like starting at step 425. At step 430, a response to a request is made for editing a document containing rich text. Rich text editing controls are presented for editing the document at step 435, as a response to the request. At step 440, changes are accepted to the document using the rich text class and rich text list class for editing. If a request for spell checking is made, the request is recognized and a response generated, at step 445. At step 450, a spell check panel is presented that displays spelling alternatives to a misspelled word. Upon selection of a substitution, a spelling substitution is accepted and entered into the rich text document using the rich text classes provided by this invention.
While the invention has been described in terms of preferred embodiments, those skilled in the art will recognize that the invention can be practiced with modifications and in the spirit and scope of the appended claims.

Claims (10)

What is claimed is:
1. A system for providing a spellchecker function and for use with documents having rich text, the system comprising:
a CPU, a computer readable memory and a computer readable storage media;
program instructions to initialize a dictionary containing words;
program instructions to create at least one signature for each dictionary word;
program instructions to add each dictionary word to at least one list keyed by each of the at least one signatures for each dictionary word;
program instructions to determine that a word is misspelled by checking the dictionary for the misspelled word resulting in a null value, the checking the dictionary comprising determining whether the misspelled word is present in the at least one list for a primary signature of the misspelled word, and when the misspelled word is not present in the at least one list, then the misspelled word is not spelled correctly resulting in the null value;
program instructions to create a substitution list for the misspelled word when the misspelled word is not spelled correctly, which includes:
creating at least one signature associated with the misspelled word;
finding all the dictionary words in the at least one list keyed by the at least one signature associated with the misspelled word; and
selecting best matches to the misspelled word; and
program instructions to provide from the selected best matches at least one replacement word for the misspelled word in the documents having rich text,
wherein the program instructions are stored on the computer readable storage media for execution by the CPU via the computer readable memory.
2. The system of claim 1, wherein the at least one signature associated with the misspelled word and for each dictionary word is provided by extracting one or more letters and combining the one or more letters.
3. The system of claim 2, wherein the extracting the one or more letters and the combining is provided according to at least one of the following:
when the dictionary word or misspelled word is less than three characters, the at least one signature is the dictionary word or misspelled word itself;
when the length of each of the dictionary word or misspelled word is greater than eight characters, one signature is the first half of the word;
when the length of the dictionary word or misspelled word is eight the first three and last three characters are each signatures;
when the length of the dictionary word or misspelled word is between four and seven, the first two characters and last two characters are each signatures;
when the length of the dictionary word or misspelled word equals four, the first two characters plus the last character is the signature;
when the length of the dictionary word or misspelled word is greater than four, the first four and the last four characters are each signatures; and
when the length of the dictionary word or misspelled word equals four, the first character plus the last two characters is a signature.
4. The system of claim 1, wherein the providing includes providing more than one replacement words in an ordered list for selection, wherein the more than one replacement words are ordered based upon a score.
5. The system of claim 1, further comprising program instructions to present a spell check panel that displays spelling alternatives to the misspelled word associated with the documents having rich text.
6. The system of claim 5, further comprising program instructions to search the dictionary to locate one or more words for presentation in the spell check panel.
7. The system of claim 6, wherein the creating the at least one signature for each dictionary word includes one or more words in the dictionary each having one or more associated signatures to aid in locating a match for the misspelled word.
8. The system of claim 1, wherein the dictionary is created from word lists which are instantiated and serialized, wherein a structure of the dictionary is a hashtable, and each dictionary word is assigned a set of signatures.
9. The system of claim 8, wherein each dictionary word has a primary signature.
10. The system of claim 9, wherein the primary signature includes a plurality of letters.
US13/948,728 2003-06-26 2013-07-23 Rich text handling for a web application Expired - Fee Related US9256584B2 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US13/948,728 US9256584B2 (en) 2003-06-26 2013-07-23 Rich text handling for a web application
US15/009,027 US10169310B2 (en) 2003-06-26 2016-01-28 Rich text handling for a web application

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US10/606,547 US7890852B2 (en) 2003-06-26 2003-06-26 Rich text handling for a web application
US12/940,462 US8566709B2 (en) 2003-06-26 2010-11-05 Rich text handling for a web application
US13/948,728 US9256584B2 (en) 2003-06-26 2013-07-23 Rich text handling for a web application

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US12/940,462 Continuation US8566709B2 (en) 2003-06-26 2010-11-05 Rich text handling for a web application

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US15/009,027 Continuation US10169310B2 (en) 2003-06-26 2016-01-28 Rich text handling for a web application

Publications (2)

Publication Number Publication Date
US20130311879A1 US20130311879A1 (en) 2013-11-21
US9256584B2 true US9256584B2 (en) 2016-02-09

Family

ID=33540092

Family Applications (7)

Application Number Title Priority Date Filing Date
US10/606,547 Expired - Fee Related US7890852B2 (en) 2003-06-26 2003-06-26 Rich text handling for a web application
US12/940,479 Expired - Fee Related US8543909B2 (en) 2003-06-26 2010-11-05 Rich text handling for a web application
US12/940,462 Expired - Fee Related US8566709B2 (en) 2003-06-26 2010-11-05 Rich text handling for a web application
US13/941,688 Expired - Fee Related US9330078B2 (en) 2003-06-26 2013-07-15 Rich text handling for a web application
US13/948,728 Expired - Fee Related US9256584B2 (en) 2003-06-26 2013-07-23 Rich text handling for a web application
US15/009,027 Expired - Lifetime US10169310B2 (en) 2003-06-26 2016-01-28 Rich text handling for a web application
US15/085,032 Expired - Lifetime US10042828B2 (en) 2003-06-26 2016-03-30 Rich text handling for a web application

Country Status (1)

Country Link
US (7) US7890852B2 (en)

US7047493B1 (en) 2000-03-31 2006-05-16 Brill Eric D Spell checker with arbitrary length string-to-string transformations to improve noisy channel spelling correction
US7222298B2 (en) 2002-11-12 2007-05-22 Siemens Communications, Inc. Advanced JAVA rich text format generator
US7581170B2 (en) 2001-05-31 2009-08-25 Lixto Software Gmbh Visual and interactive wrapper generation, automated information extraction from Web pages, and translation into XML

Family Cites Families (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4029236A (en) * 1976-05-17 1977-06-14 Colgate-Palmolive Company Two product dispenser with cooperating telescoping cylinders
US4366919A (en) * 1978-05-01 1983-01-04 Coaxial Cartridges, Inc. Composite cartridge and device for metering extrusion of contents
US5301842A (en) * 1991-03-06 1994-04-12 Frank Ritter Multicomponent cartridge for plastic materials
US6173311B1 (en) 1997-02-13 2001-01-09 Pointcast, Inc. Apparatus, method and article of manufacture for servicing client requests on a network
US6253228B1 (en) 1997-03-31 2001-06-26 Apple Computer, Inc. Method and apparatus for updating and synchronizing information between a client and a server
US6185591B1 (en) 1997-07-29 2001-02-06 International Business Machines Corp. Text edit system with enhanced undo user interface
US6041255A (en) * 1998-04-16 2000-03-21 Kroll; Mark W. Disposable external defibrillator
US6336124B1 (en) 1998-10-01 2002-01-01 Bcl Computers, Inc. Conversion data representing a document to other formats for manipulation and display
US6519597B1 (en) 1998-10-08 2003-02-11 International Business Machines Corporation Method and apparatus for indexing structured documents with rich data types
AUPP719798A0 (en) * 1998-11-19 1998-12-17 Ramset Fasteners (Aust.) Pty. Limited A cartridge dispensing gun
US7490292B2 (en) 1999-12-17 2009-02-10 International Business Machines Corporation Web-based instruction
JP2001184344A (en) 1999-12-21 2001-07-06 Internatl Business Mach Corp <Ibm> Information processing system, proxy server, web page display control method, storage medium and program transmitter
US7421650B2 (en) 2001-05-01 2008-09-02 General Electric Company Method and system for publishing electronic media to a document management system in various publishing formats independent of the media creation application
US8065219B2 (en) * 2001-06-13 2011-11-22 Sungard Energy Systems Inc. System architecture and method for energy industry trading and transaction management
DE10128611A1 (en) * 2001-06-13 2002-12-19 Fischer Artur Werke Gmbh Ejection device for cartridge with two concentric chambers for building materials etc. has ejector ram of one-piece injection-molded part with internal ram part and external ring-shaped ram part for both chambers
DE50203865D1 (en) 2001-08-29 2005-09-15 Cavitron V Hagen & Funke Gmbh Device for processing materials
US20030079052A1 (en) 2001-10-24 2003-04-24 Kushnirskiy Igor Davidovich Method and apparatus for a platform independent plug-in

Patent Citations (44)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4833610A (en) 1986-12-16 1989-05-23 International Business Machines Corporation Morphological/phonetic method for ranking word similarities
US5604897A (en) 1990-05-18 1997-02-18 Microsoft Corporation Method and system for correcting the spelling of misspelled words
US5765180A (en) 1990-05-18 1998-06-09 Microsoft Corporation Method and system for correcting the spelling of misspelled words
US5832268A (en) 1993-08-04 1998-11-03 Anderson; James B. System and method for supporting complex objects in an object oriented computing environment
US5845306A (en) * 1994-06-01 1998-12-01 Mitsubishi Electric Information Technology Center America, Inc. Context based system for accessing dictionary entries
US5694610A (en) 1994-09-01 1997-12-02 Microsoft Corporation Method and system for editing and formatting data in a dialog window
US5787451A (en) 1995-05-08 1998-07-28 Microsoft Corporation Method for background spell checking a word processing document
US5977967A (en) 1996-05-01 1999-11-02 Electronic Data Systems Corporation Object-oriented business object interface framework and method
US6085206A (en) 1996-06-20 2000-07-04 Microsoft Corporation Method and system for verifying accuracy of spelling and grammatical composition of a document
US5999938A (en) 1997-01-31 1999-12-07 Microsoft Corporation System and method for creating a new data structure in memory populated with data from an existing data structure
US6047300A (en) * 1997-05-15 2000-04-04 Microsoft Corporation System and method for automatically correcting a misspelled word
US6496202B1 (en) 1997-06-30 2002-12-17 Sun Microsystems, Inc. Method and apparatus for generating a graphical user interface
US6182092B1 (en) 1997-07-14 2001-01-30 Microsoft Corporation Method and system for converting between structured language elements and objects embeddable in a document
US6330574B1 (en) 1997-08-05 2001-12-11 Fujitsu Limited Compression/decompression of tags in markup documents by creating a tag code/decode table based on the encoding of tags in a DTD included in the documents
US6105036A (en) 1997-08-27 2000-08-15 International Business Machines Corporation Computer system and method of displaying a source code file with an ordered arrangement of object definitions
US5991713A (en) 1997-11-26 1999-11-23 International Business Machines Corp. Efficient method for compressing, storing, searching and transmitting natural language text
US6381620B1 (en) 1997-12-10 2002-04-30 Matsushita Electric Industrial Co., Ltd. Rich text medium displaying method and picture information providing system using calculated average reformatting time for multimedia objects
US20010042081A1 (en) 1997-12-19 2001-11-15 Ian Alexander Macfarlane Markup language paring for documents
US6480206B2 (en) 1998-02-24 2002-11-12 Sun Microsystems, Inc. Method and apparatus for an extensible editor
US6470364B1 (en) 1998-02-24 2002-10-22 Sun Microsystems, Inc. Method and apparatus for generating text components
US6131102A (en) * 1998-06-15 2000-10-10 Microsoft Corporation Method and system for cost computation of spelling suggestions and automatic replacement
US6374210B1 (en) * 1998-11-30 2002-04-16 U.S. Philips Corporation Automatic segmentation of a text
US6456209B1 (en) 1998-12-01 2002-09-24 Lucent Technologies Inc. Method and apparatus for deriving a plurally parsable data compression dictionary
US20020147724A1 (en) 1998-12-23 2002-10-10 Fries Karen E. System for enhancing a query interface
US7444348B2 (en) 1998-12-23 2008-10-28 Microsoft Corporation System for enhancing a query interface
US6601059B1 (en) 1998-12-23 2003-07-29 Microsoft Corporation Computerized searching tool with spell checking
US6345307B1 (en) 1999-04-30 2002-02-05 General Instrument Corporation Method and apparatus for compressing hypertext transfer protocol (HTTP) messages
US20040148307A1 (en) 1999-12-02 2004-07-29 Rempell Steven H Browser based web site generation tool and run time engine
US7594168B2 (en) 1999-12-02 2009-09-22 Akira Technologies, Inc. Browser based web site generation tool and run time engine
US7047493B1 (en) 2000-03-31 2006-05-16 Brill Eric D Spell checker with arbitrary length string-to-string transformations to improve noisy channel spelling correction
US6883137B1 (en) 2000-04-17 2005-04-19 International Business Machines Corporation System and method for schema-driven compression of extensible mark-up language (XML) documents
US20020029229A1 (en) 2000-06-30 2002-03-07 Jakopac David E. Systems and methods for data compression
US20020071139A1 (en) 2000-09-19 2002-06-13 Janik Craig M. Digital image frame and method for using the same
US7178100B2 (en) 2000-12-15 2007-02-13 Call Charles G Methods and apparatus for storing and manipulating variable length and fixed length data elements as a sequence of fixed length integers
US20020143521A1 (en) 2000-12-15 2002-10-03 Call Charles G. Methods and apparatus for storing and manipulating variable length and fixed length data elements as a sequence of fixed length integers
US20030200254A1 (en) 2000-12-19 2003-10-23 Coach Wei Methods and techniques for delivering rich java applications over thin-wire connections with high performance and scalability
US7111011B2 (en) 2001-05-10 2006-09-19 Sony Corporation Document processing apparatus, document processing method, document processing program and recording medium
US20030007397A1 (en) 2001-05-10 2003-01-09 Kenichiro Kobayashi Document processing apparatus, document processing method, document processing program and recording medium
US7581170B2 (en) 2001-05-31 2009-08-25 Lixto Software Gmbh Visual and interactive wrapper generation, automated information extraction from Web pages, and translation into XML
US20030014442A1 (en) 2001-07-16 2003-01-16 Shiigi Clyde K. Web site application development method using object model for managing web-based content
US20030088410A1 (en) 2001-11-06 2003-05-08 Geidl Erik M Natural input recognition system and method using a contextual mapping engine and adaptive user bias
US7246060B2 (en) * 2001-11-06 2007-07-17 Microsoft Corporation Natural input recognition system and method using a contextual mapping engine and adaptive user bias
US7222298B2 (en) 2002-11-12 2007-05-22 Siemens Communications, Inc. Advanced JAVA rich text format generator
US20040230550A1 (en) 2003-04-03 2004-11-18 Simpson Michael J. Method and apparatus for electronic filing of patent and trademark applications and related correspondence

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Chen et al., "A GUI Environment to Manipulate FSMs for Testing GUI-based Applications in Java", 2001, IEEE, pp. 1-10.
Final Office Action dated Nov. 19, 2015 in related U.S. Appl. No. 13/941,688, 14 pages.
Office Action dated May 8, 2015 for U.S. Appl. No. 13/941,688, 16 pages.

Also Published As

Publication number Publication date
US20160147732A1 (en) 2016-05-26
US20130311879A1 (en) 2013-11-21
US20110055690A1 (en) 2011-03-03
US20040268235A1 (en) 2004-12-30
US8543909B2 (en) 2013-09-24
US8566709B2 (en) 2013-10-22
US9330078B2 (en) 2016-05-03
US10169310B2 (en) 2019-01-01
US20130305141A1 (en) 2013-11-14
US10042828B2 (en) 2018-08-07
US20160210272A1 (en) 2016-07-21
US20110055686A1 (en) 2011-03-03
US7890852B2 (en) 2011-02-15

Similar Documents

Publication Publication Date Title
US10042828B2 (en) Rich text handling for a web application
US11288338B2 (en) Extracting a portion of a document, such as a page
US7516401B2 (en) Function-based object model for analyzing a web page table in a mobile device by identifying table objects similarity in function
US6021416A (en) Dynamic source code capture for a selected region of a display
US8359550B2 (en) Method for dynamically generating a “table of contents” view of the HTML-based information system
US6857102B1 (en) Document re-authoring systems and methods for providing device-independent access to the world wide web
US6799299B1 (en) Method and apparatus for creating stylesheets in a data processing system
US7111011B2 (en) Document processing apparatus, document processing method, document processing program and recording medium
US6708311B1 (en) Method and apparatus for creating a glossary of terms
US7296229B2 (en) Method and apparatus for providing a central dictionary and glossary server
US8346803B2 (en) Dynamic generation of target files from template files and tracking of the processing of target files
US7277879B2 (en) Concept navigation in data storage systems
US20080243791A1 (en) Apparatus and method for searching information and computer program product therefor
US7546533B2 (en) Storage and utilization of slide presentation slides
JP3023943B2 (en) Document search device
US20070005649A1 (en) Contextual title extraction
JPH11110384A (en) Method and device for retrieving and displaying structured document
JP3343941B2 (en) Example sentence search system
JP2015162107A (en) Correspondence relation extraction device, correspondence relation extraction method, and correspondence relation extraction program
KR20100014116A (en) Wi-the mechanism of rule-based user defined for tab
JP3193249B2 (en) Keyword search method
JPH0863483A (en) Information analysis and editing system
JPH11338867A (en) Document summarizing method and device and storage medium storing document summarizing program

Legal Events

Date Code Title Description
AS Assignment

Owner name: INTERNATIONAL BUSINESS MACHINES CORPORATION, NEW Y

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:WASON, JAMES R.;REEL/FRAME:030859/0514

Effective date: 20130718

STCF Information on status: patent grant

Free format text: PATENTED CASE

FEPP Fee payment procedure

Free format text: MAINTENANCE FEE REMINDER MAILED (ORIGINAL EVENT CODE: REM.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

LAPS Lapse for failure to pay maintenance fees

Free format text: PATENT EXPIRED FOR FAILURE TO PAY MAINTENANCE FEES (ORIGINAL EVENT CODE: EXP.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

STCH Information on status: patent discontinuation

Free format text: PATENT EXPIRED DUE TO NONPAYMENT OF MAINTENANCE FEES UNDER 37 CFR 1.362

FP Lapsed due to failure to pay maintenance fee

Effective date: 20200209

AS Assignment

Owner name: KYNDRYL, INC., NEW YORK

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:INTERNATIONAL BUSINESS MACHINES CORPORATION;REEL/FRAME:057885/0644

Effective date: 20210930