
CN112231034A - Software interface element identification method and device combining RPA and AI - Google Patents

Software interface element identification method and device combining RPA and AI

Info

Publication number
CN112231034A
CN112231034A CN202011126611.2A CN202011126611A CN112231034A CN 112231034 A CN112231034 A CN 112231034A CN 202011126611 A CN202011126611 A CN 202011126611A CN 112231034 A CN112231034 A CN 112231034A
Authority
CN
China
Prior art keywords
interface
elements
similarity
primitive
structural
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011126611.2A
Other languages
Chinese (zh)
Inventor
张小勇
罗亮
褚瑞
李玮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Benying Network Technology Co Ltd
Beijing Laiye Network Technology Co Ltd
Original Assignee
Beijing Benying Network Technology Co Ltd
Beijing Laiye Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Benying Network Technology Co Ltd, Beijing Laiye Network Technology Co Ltd
Publication of CN112231034A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00: Arrangements for program control, e.g. control units
    • G06F9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44: Arrangements for executing specific programs
    • G06F9/451: Execution arrangements for user interfaces

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Image Analysis (AREA)

Abstract

The application provides a method and a device for identifying software interface elements by combining RPA and AI, relating to the fields of RPA and AI. The method comprises the following steps: extracting interface elements in a current software interface; performing a similarity operation based on the structural mode of a target element and the interface elements; and determining the distribution information of the target element on the current software interface according to the similarity operation result. This improves the accuracy with which interface elements on the software interface are matched during robotic process automation; the implementation is simple, and the effect is stable and reliable.

Description

Software interface element identification method and device combining RPA and AI
Technical Field
The application relates to the technical fields of Robotic Process Automation (RPA) and Artificial Intelligence (AI), and in particular to a method and a device for identifying software interface elements by combining RPA and AI.
Background
Robotic Process Automation (RPA) uses dedicated robot software to simulate human operations on a computer and to execute process tasks automatically according to rules. Artificial Intelligence (AI) is a technical science that studies and develops theories, methods, techniques and application systems for simulating, extending and expanding human intelligence. At present, RPA and AI technologies offer a high degree of automation, high accuracy and low cost, and are widely applied.
In the prior art, in order to ensure the accuracy of an automation process, a software robot needs to accurately identify the position of a target element and perform automation operation on the target element when the software robot runs the process. In application scenarios such as remote desktop or virtual machine, interface elements are generally detected by computer vision technology, and characteristic attributes of the interface elements are extracted as matching bases of the interface elements during process operation.
However, such a matching method is not stable, and it is easy to cause matching errors or matching failures of the target elements, so that the accuracy of the automated process is low.
Disclosure of Invention
The application provides a software interface element identification method and device combining RPA and AI, which can improve the matching accuracy of interface elements on a software interface in a robot process automation process, and is simple in implementation mode and stable and reliable in effect.
In a first aspect, the present application provides a method for identifying software interface elements in combination with RPA and AI, including:
extracting interface elements in a current software interface;
performing similarity operation based on the structural mode of the target element and the interface element;
and determining the distribution information of the target element on the current software interface according to the similarity operation result.
In one possible design, the extracting interface elements in the current software interface includes:
intercepting an interface image of a current software interface;
and extracting all interface elements from the interface image by an Optical Character Recognition (OCR) technology or a pre-trained deep learning model.
In one possible design, the structural pattern includes:
a primitive set consisting of target elements and structural elements; and a position relation set of the position relation between every two elements in the primitive set.
In one possible design, the performing similarity operation based on the structural mode of the target element and the interface element includes:
determining all approximate primitive sets in the current software interface according to the primitive sets;
for each approximate primitive set, obtaining a first similarity set of the approximate primitive set based on the similarity between each element in the approximate primitive set and the corresponding element in the primitive set;
determining a second similarity of each approximate primitive set based on the position relationship between every two elements in each approximate primitive set and the similarity of the position relationship between every two corresponding elements in the primitive set;
and determining the total similarity of the primitive set and each approximate primitive set based on the first similarity set and the second similarity.
In one possible design, determining all approximate primitive sets in the current software interface from the primitive sets includes:
searching interface elements matched with target elements in the primitive set to form a first interface element set corresponding to the target elements;
respectively searching interface elements matched with all structural elements in the primitive set to form a second interface element set corresponding to all the structural elements; wherein each structural element in the primitive set corresponds to an independent second interface element set;
and respectively selecting any interface element from the first interface element set and each second interface element set to form the approximate primitive set.
In one possible design, for each approximate primitive set, obtaining a first similarity set of the approximate primitive set based on the similarity between each element in the approximate primitive set and the corresponding element in the primitive set includes:
and obtaining the similarity between a first interface element in the approximate primitive set and a target element in the primitive set, and the similarity between each second interface element in the approximate primitive set and each corresponding structural element in the primitive set, so as to obtain the first similarity set of the approximate primitive set.
In one possible design, determining a second similarity of each approximate primitive set based on a similarity between a position relationship between two elements in each approximate primitive set and a position relationship between corresponding two elements in the primitive set includes:
combining elements in the primitive set pairwise to form a sub-mode set of the primitive set;
combining elements in each approximate primitive set pairwise to form a sub-mode set of the approximate primitive set;
and calculating the similarity between each element in the sub-mode set of the approximate primitive set and each element in the sub-mode set of the primitive set to obtain a second similarity set of each approximate primitive set.
In one possible design, determining distribution information of the target element on the current software interface according to the result of the similarity operation includes:
selecting an approximate primitive set with the maximum total similarity as a candidate set;
and if the total similarity of the candidate set is greater than a preset threshold, determining the distribution information of the target element on the current software interface according to the position relation of the interface elements in the candidate set.
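The selection rule in the two steps above can be sketched as follows. This is a minimal illustration only: the function name `pick_candidate` and the list-of-scores input are assumptions made for the example, not part of the application.

```python
from typing import Optional, Sequence, Tuple


def pick_candidate(
    total_similarities: Sequence[float], threshold: float
) -> Optional[Tuple[int, float]]:
    """Return (index, score) of the best approximate primitive set, or None."""
    if not total_similarities:
        return None
    # The approximate primitive set with the maximum total similarity is the candidate.
    best_index = max(range(len(total_similarities)), key=total_similarities.__getitem__)
    best_score = total_similarities[best_index]
    # Accept the candidate only when its score is strictly above the preset threshold.
    if best_score > threshold:
        return best_index, best_score
    return None
```

Returning `None` when the threshold is not exceeded lets the caller treat the case as a failed match rather than acting on a weak candidate.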
In a possible design, after determining distribution information of the target element on the current software interface according to the result of the similarity operation, the method further includes:
and executing the access operation on the target element according to the distribution information.
In a possible design, the performing similarity operation based on the structural mode of the target element and the interface element further includes:
and performing similarity matching operation on the structural mode corresponding to the target element and the structural mode corresponding to each interface element in the current software interface.
In a possible design, before performing the similarity operation based on the structural mode of the target element and the interface element, the method further includes:
extracting all interface elements of the template software interface as candidate elements;
and selecting a target element from the candidate elements, and acquiring a structural mode corresponding to the target element.
In a possible design, the selecting a target element from the candidate elements and obtaining a structural mode corresponding to the target element includes:
acquiring a structural element associated with the target element;
generating a candidate mode of the target element according to the target element, the structural element, and the position relation between the target element and the structural element;
if the position relation corresponding to the candidate mode in the template software interface is determined not to be unique, reselecting the structural element associated with the target element;
and if the position relation corresponding to the candidate mode in the template software interface is determined to be unique, taking the candidate mode as the structural mode corresponding to the target element.
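The uniqueness check in the steps above can be illustrated with a small sketch. Everything here is an assumption made for the example: elements are reduced to `(kind, center_x, center_y)` tuples, the position relation is quantized into eight 45-degree sectors, and "reselecting" is simplified to adding the next nearest neighbour as an extra structural element until the candidate mode matches exactly one element of the template interface.

```python
import math
from typing import List, Optional, Sequence, Tuple

Element = Tuple[str, float, float]        # (kind, center_x, center_y), hypothetical
Mode = Tuple[str, List[Tuple[str, int]]]  # (target kind, [(structural kind, sector)])


def sector(src: Element, dst: Element) -> int:
    """Quantize the bearing from src to dst into one of 8 sectors of 45 degrees."""
    angle = math.degrees(math.atan2(dst[2] - src[2], dst[1] - src[1])) % 360
    return int(((angle + 22.5) % 360) // 45)


def count_matches(mode: Mode, elements: Sequence[Element]) -> int:
    """Count elements that satisfy every (kind, sector) constraint of the mode."""
    target_kind, constraints = mode
    hits = 0
    for i, cand in enumerate(elements):
        if cand[0] != target_kind:
            continue
        if all(
            any(j != i and other[0] == kind and sector(cand, other) == s
                for j, other in enumerate(elements))
            for kind, s in constraints
        ):
            hits += 1
    return hits


def build_structural_mode(target_index: int, elements: Sequence[Element]) -> Optional[Mode]:
    """Add nearest neighbours one by one until the mode identifies a single element."""
    target = elements[target_index]
    neighbours = sorted(
        (e for j, e in enumerate(elements) if j != target_index),
        key=lambda e: math.hypot(e[1] - target[1], e[2] - target[2]),
    )
    constraints: List[Tuple[str, int]] = []
    for neighbour in neighbours:
        constraints.append((neighbour[0], sector(target, neighbour)))
        mode = (target[0], list(constraints))
        if count_matches(mode, elements) == 1:  # unique in the template interface
            return mode
    return None  # no unique mode could be formed from the available neighbours


# Two input boxes, each with a text label to its left (made-up template layout).
template = [
    ("input", 100.0, 100.0), ("text", 20.0, 100.0),
    ("input", 100.0, 200.0), ("text", 20.0, 200.0),
]
mode = build_structural_mode(0, template)
```

In this made-up layout, a label to the left is not enough to distinguish the two input boxes, so the sketch keeps adding structural elements until only the first input box satisfies the mode.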
In one possible design, the extracting all interface elements of the template software interface as candidate elements includes:
intercepting an interface image of a template software interface;
and extracting all interface elements from the interface image of the template software interface as candidate elements by an Optical Character Recognition (OCR) technology or a pre-trained deep learning model.
In one possible design, the structural elements include: any one or more of an icon element, a text element, a key element.
In a second aspect, the present application further provides an apparatus for identifying software interface elements in combination with RPA and AI, including:
the extraction module is used for extracting interface elements in the current software interface;
the matching module is used for carrying out similarity operation based on the structural mode of the target element and the interface element;
and the identification module is used for determining the distribution information of the target element on the current software interface according to the similarity operation result.
In one possible design, the extraction module is specifically configured to:
intercepting an interface image of a current software interface;
and extracting all interface elements from the interface image by an Optical Character Recognition (OCR) technology or a pre-trained deep learning model.
In one possible design, the structural pattern includes:
a primitive set consisting of target elements and structural elements; and a position relation set of the position relation between every two elements in the primitive set.
In one possible design, the matching module is specifically configured to:
determining all approximate primitive sets in the current software interface according to the primitive sets;
for each approximate primitive set, obtaining a first similarity set of the approximate primitive set based on the similarity between each element in the approximate primitive set and the corresponding element in the primitive set;
determining a second similarity of each approximate primitive set based on the position relationship between every two elements in each approximate primitive set and the similarity of the position relationship between every two corresponding elements in the primitive set;
and determining the total similarity of the primitive set and each approximate primitive set based on the first similarity set and the second similarity.
In one possible design, the matching module is specifically configured to:
searching interface elements matched with target elements in the primitive set to form a first interface element set corresponding to the target elements;
respectively searching interface elements matched with all structural elements in the primitive set to form a second interface element set corresponding to all the structural elements; wherein each structural element in the primitive set corresponds to an independent second interface element set;
and respectively selecting any interface element from the first interface element set and each second interface element set to form the approximate primitive set.
In one possible design, the matching module is specifically configured to:
and obtaining the similarity between a first interface element in the approximate primitive set and a target element in the primitive set, and the similarity between each second interface element in the approximate primitive set and each corresponding structural element in the primitive set, so as to obtain the first similarity set of the approximate primitive set.
In one possible design, the matching module is specifically configured to:
combining elements in the primitive set pairwise to form a sub-mode set of the primitive set;
combining elements in each approximate primitive set pairwise to form a sub-mode set of the approximate primitive set;
and calculating the similarity between each element in the sub-mode set of the approximate primitive set and each element in the sub-mode set of the primitive set to obtain a second similarity set of each approximate primitive set.
In one possible design, the identification module is specifically configured to:
selecting an approximate primitive set with the maximum total similarity as a candidate set;
and if the total similarity of the candidate set is greater than a preset threshold, determining the distribution information of the target element on the current software interface according to the position relation of the interface elements in the candidate set.
In one possible design, the apparatus further includes an execution module configured to:
and executing the access operation on the target element according to the distribution information.
In one possible design, the matching module is further configured to:
and performing similarity matching operation on the structural mode corresponding to the target element and the structural mode corresponding to each interface element in the current software interface.
In one possible design, the apparatus further includes an acquisition module configured to:
extracting all interface elements of the template software interface as candidate elements;
and selecting a target element from the candidate elements, and acquiring a structural mode corresponding to the target element.
In one possible design, the acquisition module is further configured to:
acquiring a structural element associated with the target element;
generating a candidate mode of the target element according to the target element, the structural element, and the position relation between the target element and the structural element;
if the position relation corresponding to the candidate mode in the template software interface is determined not to be unique, reselecting the structural element associated with the target element;
and if the position relation corresponding to the candidate mode in the template software interface is determined to be unique, taking the candidate mode as the structural mode corresponding to the target element.
In one possible design, the acquisition module is further configured to:
intercepting an interface image of a template software interface;
and extracting all interface elements from the interface image of the template software interface as candidate elements by an Optical Character Recognition (OCR) technology or a pre-trained deep learning model.
In one possible design, the structural elements include: any one or more of an icon element, a text element, a key element.
In a third aspect, the present application further provides an electronic device, including:
a processor; and
a memory for storing executable instructions of the processor;
wherein the processor is configured to perform any one of the methods of identifying software interface elements in conjunction with RPA and AI of the first aspect via execution of the executable instructions.
In a fourth aspect, the present application further provides a storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the method for identifying a software interface element in combination with an RPA and an AI according to any one of the first aspect.
The application provides a software interface element identification method, device, equipment and storage medium combining RPA and AI, which perform a similarity operation based on the structural mode of the target element and the interface elements, and determine the distribution information of the target element on the current software interface according to the similarity operation result. Therefore, the matching accuracy of the interface elements on the software interface in the robotic process automation process can be improved, the implementation is simple, and the effect is stable and reliable.
Drawings
In order to illustrate the technical solutions in the embodiments of the present application more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings in the following description show some embodiments of the present application, and those skilled in the art can obtain other drawings from them without inventive labor.
FIG. 1 is a diagram illustrating an application scenario of a method for identifying software interface elements in combination with RPA and AI according to an example embodiment of the present application;
FIG. 2 is a flow diagram illustrating a method for identifying software interface elements that incorporate RPA and AI in accordance with an exemplary embodiment of the present application;
FIG. 3 is a flow diagram illustrating a method for identifying software interface elements that incorporate RPA and AI in accordance with another example embodiment of the present application;
FIG. 4 is a block diagram illustrating an apparatus for identifying software interface elements in conjunction with RPA and AI according to an example embodiment;
FIG. 5 is a block diagram illustrating an apparatus for identifying software interface elements in conjunction with RPA and AI according to another example embodiment;
FIG. 6 is a schematic structural diagram of an electronic device according to an example embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The terms "first," "second," "third," "fourth," and the like in the description and in the claims of the present application and in the above-described drawings (if any) are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the application described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
In the field of Robotic Process Automation (RPA), automating a process requires a software robot to frequently access control elements (interface elements for short) on a software interface and operate on them to execute the corresponding tasks. In the prior art, in order to ensure the accuracy of an automation process, a software robot needs to accurately identify the position of a target element and perform the automation operation on it when running the process. In application scenarios such as a remote desktop or a virtual machine, interface elements are generally detected by computer vision technology, and the feature attributes of the interface elements are extracted as the basis for matching interface elements while the process runs. However, this matching method is not stable and easily leads to matching errors or matching failures for the target element, so that the accuracy of the automated process is low.
In view of the above technical problems, the application provides a method and a device for identifying software interface elements by combining RPA and AI, which can improve the accuracy of matching interface elements on a software interface in a robotic process automation process, with a simple implementation and a stable and reliable effect. Fig. 1 is a diagram illustrating an application scenario of a method for identifying software interface elements combining RPA and AI according to an example embodiment of the present application. As shown in Fig. 1, the characteristic attribute information of an interface element (e.g., an input box) alone is not stable or reliable, and matching on it easily leads to matching errors or matching failures. The environmental information around the interface element, however, can be fully used for matching and positioning. Without loss of generality, as shown in Fig. 1, the target element to be matched is the input box control under "host name (H)", and the surrounding environment information consists of three interface elements, labeled "auxiliary matching element 1", "auxiliary matching element 2", and "auxiliary matching element 3" in the figure. These 4 interface elements (1 target element + 3 auxiliary matching elements) form a structural graph in the plane, shown as the area outlined by a thick line in Fig. 1. The structural graph can be regarded as a structural pattern, and a structural pattern recognition method can be used to search the software interface for a structure that is identical or similar to it, thereby determining the position of the target element.
Put simply, the search looks for an element of type input box that has an input box above it, to its right, and below it. The only interface element that satisfies this condition is the input box below "host name (H)", which is the target element being sought.
The method can improve the matching accuracy of the interface elements on the software interface in the robot process automation process, and has the advantages of simple implementation mode and stable and reliable effect.
Fig. 2 is a flowchart illustrating a method for identifying a software interface element that combines an RPA and an AI according to an example embodiment of the present application, where as shown in fig. 2, the method provided in this embodiment may include:
Step 101, extracting interface elements in the current software interface.
In this embodiment, the software robot may intercept an interface image of the current software interface. Then, based on a Natural Language Processing (NLP) technology and a Natural Language Understanding (NLU) technology, all interface elements are extracted from the interface image through an Optical Character Recognition (OCR) technology or a pre-trained deep learning model.
Specifically, interface elements in the software interface mainly include text, icons, and controls. A control element usually has a text element (Label) that identifies it. For example, a button typically contains a short text that identifies its function (e.g., "OK" or "Cancel"), and an input box typically has a short text to its left or above it that identifies its function (e.g., "username" or "password"). Therefore, when matching and searching for an interface element, the Label information that identifies it can be fully used for assistance. Such Label information is referred to herein as an "anchor point". More generally, an "anchor point" can be understood as a reference point, similar to a landmark, that is stable in form (although its position may vary), easily recognizable, and globally unique. An "anchor point" may be an icon or a piece of text. Accordingly, text elements are detected by Optical Character Recognition (OCR), which yields the position and character content of each piece of text in the interface; for icons and control elements, their positions and types in the interface can be detected by a deep learning object detection algorithm (such as SSD or Faster R-CNN).
Step 102, performing similarity operation based on the structural mode of the target element and the interface element.
In this embodiment, the structural modes include: a primitive set consisting of target elements and structural elements; and a position relation set of the position relation between every two elements in the primitive set.
In this embodiment, all approximate primitive sets in the current software interface are determined according to the primitive set; for each approximate primitive set, a first similarity set is obtained based on the similarity between each element in the approximate primitive set and the corresponding element in the primitive set; a second similarity set of each approximate primitive set is determined based on the similarity between the position relationship of every two elements in the approximate primitive set and the position relationship of the corresponding two elements in the primitive set; and the total similarity of the primitive set and each approximate primitive set is determined based on the first similarity set and the second similarity set.
Illustratively, the primitive set is denoted as the set $\_E$:
$$\_E = \{\_t_0,\ \_a_0,\ \_a_1,\ \dots,\ \_a_n\}$$
where $\_t_0$ represents the target element, $\_a_0$ represents the 1st structural element, and $\_a_n$ represents the nth structural element;
searching the current software interface for interface elements matched with each element in the primitive set, where all the found interface elements form a set C:
$$C = \left\{ \begin{array}{l} t_{00},\ t_{01},\ \dots,\ t_{0x} \\ a_{00},\ a_{01},\ \dots,\ a_{0y} \\ \quad\vdots \\ a_{n0},\ a_{n1},\ \dots,\ a_{nz} \end{array} \right\}$$
where $t_{00}$ represents the 1st interface element matched with the target element, $t_{01}$ represents the 2nd interface element matched with the target element, and $t_{0x}$ represents the xth interface element matched with the target element; $a_{00}$ represents the 1st interface element matched with the 1st structural element, $a_{01}$ represents the 2nd interface element matched with the 1st structural element, and $a_{0y}$ represents the yth interface element matched with the 1st structural element; $a_{n0}$ represents the 1st interface element matched with the nth structural element, $a_{n1}$ represents the 2nd interface element matched with the nth structural element, and $a_{nz}$ represents the zth interface element matched with the nth structural element; x, y and n are natural numbers larger than 0;
and selecting one element from each row of elements in the set C to form an approximate primitive set.
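Choosing one element from each row of C in every possible way amounts to a Cartesian product over the rows. The sketch below illustrates this with `itertools.product`; the element names and row sizes are made up for the example, and the helper name `approximate_primitive_sets` is hypothetical.

```python
from itertools import product
from math import prod
from typing import List, Sequence, Tuple


def approximate_primitive_sets(C: Sequence[Sequence[str]]) -> List[Tuple[str, ...]]:
    """Pick one candidate from every row of C, in every possible way."""
    return list(product(*C))


# Hypothetical set C: one row of candidate interface elements per primitive.
C = [
    ["t00", "t01"],         # candidates matching the target element _t0
    ["a00", "a01", "a02"],  # candidates matching structural element _a0
    ["a10"],                # candidates matching structural element _a1
]
approx_sets = approximate_primitive_sets(C)
# The number of approximate primitive sets is the product of the row lengths.
expected_count = prod(len(row) for row in C)
```

Because the count grows multiplicatively with the row lengths, an implementation may want to prune weak candidates from each row before enumerating; that pruning is not described in the text and is mentioned here only as a design consideration.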
Illustratively, the similarity between each element in the set C and the element at the corresponding position in the primitive set is calculated to obtain a similarity set S1:
$$S1 = \left\{ \begin{array}{l} st_{00},\ st_{01},\ \dots,\ st_{0x} \\ sa_{00},\ sa_{01},\ \dots,\ sa_{0y} \\ \quad\vdots \\ sa_{n0},\ sa_{n1},\ \dots,\ sa_{nz} \end{array} \right\}$$
where $st_{00}$ represents the similarity between interface element $t_{00}$ and target element $\_t_0$, $st_{01}$ represents the similarity between interface element $t_{01}$ and target element $\_t_0$, and $st_{0x}$ represents the similarity between interface element $t_{0x}$ and target element $\_t_0$; $sa_{00}$ represents the similarity between interface element $a_{00}$ and the 1st structural element, $sa_{01}$ represents the similarity between interface element $a_{01}$ and the 1st structural element, and $sa_{0y}$ represents the similarity between interface element $a_{0y}$ and the 1st structural element; $sa_{n0}$ represents the similarity between interface element $a_{n0}$ and the nth structural element, $sa_{n1}$ represents the similarity between interface element $a_{n1}$ and the nth structural element, and $sa_{nz}$ represents the similarity between interface element $a_{nz}$ and the nth structural element. Then, for each approximate primitive set, the similarity corresponding to each of its elements is looked up in S1, giving the first similarity set of that approximate primitive set.
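The lookup of a first similarity set in S1 can be sketched as below. The representation is an assumption for illustration: S1 is stored as a list of rows parallel to C, and an approximate primitive set is identified by the column index chosen in each row. The numeric values are made up.

```python
from typing import List, Sequence


def first_similarity_set(S1: Sequence[Sequence[float]], choice: Sequence[int]) -> List[float]:
    """Look up, row by row, the similarity of the element chosen from that row."""
    return [S1[row][col] for row, col in enumerate(choice)]


# Hypothetical S1 with the same shape as C: row 0 for the target element,
# the remaining rows for the structural elements.
S1 = [
    [0.9, 0.4],
    [0.8, 0.7, 0.2],
    [1.0],
]
# Approximate primitive set made of column 0, column 1 and column 0 of the three rows.
scores = first_similarity_set(S1, (0, 1, 0))
```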
Illustratively, the elements in the primitive set are combined pairwise to form a sub-mode set $\_L$ of the primitive set; the elements in each approximate primitive set are combined pairwise to form a sub-mode set $L_i$ of that approximate primitive set, where i = 1, 2, ..., M, and M represents the total number of approximate primitive sets; and the similarity between each element in the sub-mode set of the approximate primitive set and the corresponding element in the sub-mode set of the primitive set is calculated to obtain a second similarity set of each approximate primitive set.
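A minimal sketch of the pairwise sub-mode sets and the second similarity set follows. The text does not specify how two position relations are compared, so the measure used here (ratio of distances multiplied by normalized angular agreement) is purely an assumption, as are the function names and the reduction of each element to its center point.

```python
import math
from itertools import combinations
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Relation = Tuple[float, float]  # (distance, angle in degrees)


def sub_mode_set(points: Sequence[Point]) -> List[Relation]:
    """Pair up elements and describe each pair by its distance and angle."""
    relations = []
    for (x1, y1), (x2, y2) in combinations(points, 2):
        dx, dy = x2 - x1, y2 - y1
        relations.append((math.hypot(dx, dy), math.degrees(math.atan2(dy, dx))))
    return relations


def relation_similarity(r1: Relation, r2: Relation) -> float:
    """Assumed measure: distance ratio times normalized angular agreement."""
    (d1, a1), (d2, a2) = r1, r2
    dist_sim = 1.0 if max(d1, d2) == 0 else min(d1, d2) / max(d1, d2)
    diff = abs(a1 - a2) % 360
    diff = min(diff, 360 - diff)  # smallest angle between the two directions
    return dist_sim * (1 - diff / 180)


def second_similarity_set(template: Sequence[Point], candidate: Sequence[Point]) -> List[float]:
    """Compare corresponding pairs of the primitive set and an approximate primitive set."""
    return [
        relation_similarity(a, b)
        for a, b in zip(sub_mode_set(template), sub_mode_set(candidate))
    ]


template_points = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
shifted = [(5.0, 5.0), (15.0, 5.0), (5.0, 15.0)]  # same layout, translated
scaled = [(0.0, 0.0), (20.0, 0.0), (0.0, 20.0)]   # same layout, twice as large
```

Both lists are produced in the same pair order by `combinations`, so the elements of the two sets must be listed in corresponding order (target first, then structural elements) for the comparison to be meaningful.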
Illustratively, the sum of the respective similarities in the first similarity set and the second similarity set of the approximate primitive set is taken as the total similarity of the approximate primitive set.
One skilled in the art may determine the total similarity of the approximate primitive set based on the first similarity set and the second similarity set in other ways, and the application is not limited thereto.
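As an illustration of the summation described above, the following minimal sketch looks up the first similarity set of one approximate primitive set in S1 and adds it to a second similarity set. The dictionary layout of `s1`, the function names and the numeric values are assumptions chosen for the example, and summation is only one of the possible ways of combining the two sets.

```python
def first_similarity_set(approx_set, s1):
    """Look up, in S1, the similarity of every element of one approximate primitive set."""
    return [s1[element] for element in approx_set]

def total_similarity(first_set, second_set):
    """Sum of all similarities in the first and second similarity sets."""
    return sum(first_set) + sum(second_set)

# Hypothetical similarity set S1, keyed by interface element identifier.
s1 = {"t00": 0.9, "t01": 0.6, "a00": 0.8, "a10": 1.0}
first = first_similarity_set(("t00", "a00", "a10"), s1)
second = [0.5, 0.75, 1.0]  # hypothetical sub-mode similarities
total = total_similarity(first, second)
print(first, total)
```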
Specifically, the target element is the interface element to be matched and located; it may be a control, a text or an icon. A structural element serves as context information that assists in matching and locating the target element; it may likewise be a control, a text or an icon. The element position relationship is the relative position relationship between elements, for example one of 8 orientations (such as upper left, upper right, above and below), or the relative distance and angle between two elements. Together, the target element, the structural elements and the element position relationships constitute a structural mode. The target element and the structural elements are collectively referred to as pattern primitives.
Interface elements are matched to pattern primitives according to the following rules: if the pattern primitive is a common control, the control type is matched; if it is a text, the text string is matched; and if it is an icon, template matching is used.
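The three matching rules can be sketched as a simple dispatch on the kind of primitive. This is only an illustrative sketch: the `Element` class, its field names, the `icon_threshold` parameter and the pixel-agreement function `icon_similarity` are assumptions, and the latter is merely a stand-in for a real template matching routine.

```python
from dataclasses import dataclass
from typing import Sequence

@dataclass
class Element:
    kind: str                               # "control", "text" or "icon"
    control_type: str = ""                  # used when kind == "control"
    text: str = ""                          # used when kind == "text"
    pixels: Sequence[Sequence[int]] = ()    # used when kind == "icon"

def icon_similarity(a, b):
    """Fraction of equal pixels; a simplified stand-in for template matching."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        return 0.0
    total = sum(len(row) for row in a)
    if total == 0:
        return 0.0
    same = sum(pa == pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return same / total

def matches(primitive, candidate, icon_threshold=0.9):
    """Apply the rule that corresponds to the kind of the pattern primitive."""
    if primitive.kind != candidate.kind:
        return False
    if primitive.kind == "control":
        return primitive.control_type == candidate.control_type
    if primitive.kind == "text":
        return primitive.text == candidate.text
    if primitive.kind == "icon":
        return icon_similarity(primitive.pixels, candidate.pixels) >= icon_threshold
    return False
```

In practice the icon branch would typically call an image-processing library; the pure-Python comparison is used here only to keep the sketch self-contained.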
For primitive set _ E (containing n +2 elements), two elements are combined (e.g.. sub.t)0&_ai,_ai&_ajEtc.) a first set _ L of sub-patterns constituting the "element position relationship", obviously, the set _ L has
Figure BDA0002733814470000103
And (4) each child item.
For the set C, one element is arbitrarily taken from each row to form an approximate primitive set; the set E of all such approximate primitive sets evidently has k = (x+1) × (y+1) × … × (z+1) sub-items. Then, for each sub-item E_i of E, as with the set _E, the elements are combined pairwise to form a second set L_i of candidate sub-patterns. Likewise, the set L_i also has
$$
C_{n+2}^{2} = \frac{(n+2)(n+1)}{2}
$$
sub-items. Each set can be represented as follows:
Figure BDA0002733814470000101
where _l_i in the first set _L denotes the i-th sub-pattern (e.g., _ai & _aj); E_i in the set E denotes one possible combination pattern; and l_ix in the second set L_i denotes the x-th sub-pattern of the candidate pattern E_i (e.g., a0i & axj).
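The two counts discussed above can be checked numerically. The sketch below assumes that the number of approximate primitive sets is the product of the row sizes of the set C (x+1, y+1, ..., z+1) and that the number of sub-patterns is the number of unordered pairs among the n+2 primitives; the function names are illustrative only.

```python
from math import comb, prod

def count_approximate_sets(row_sizes):
    """k: product of the number of candidates in each row of C."""
    return prod(row_sizes)

def count_sub_patterns(num_primitives):
    """Number of unordered pairs among the primitives of one set."""
    return comb(num_primitives, 2)

# Hypothetical case: 2 matches for the target element, 3 and 1 matches
# for two structural elements, i.e. 3 primitives in total.
print(count_approximate_sets([2, 3, 1]))
print(count_sub_patterns(3))
```

Because k grows multiplicatively with the number of matches per row, limiting the candidates in each row keeps the enumeration tractable.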
To identify the candidate pattern E_i that is most similar to the primitive set _E, the similarity between the second set L_i of candidate sub-patterns and the first set _L of target sub-patterns must be computed. L_i is matched against _L by matching the corresponding elements (e.g., _l_j and l_ij) and taking the average of their similarity values as the second similarity S2. The matching of sub-patterns _l_j and l_ij is based mainly on the relative relation of spatial positions: if the relative spatial positions are consistent or close, the matching degree is high; otherwise, the matching degree is low (or 0).
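One way to picture the second similarity S2 is to describe each sub-pattern by the displacement vector between the centers of its two elements and to compare corresponding vectors. The sketch below rests on assumptions that are not part of the application: elements are reduced to center coordinates, the similarity falls off linearly with the difference between displacement vectors, and `tolerance` (in pixels) is a hypothetical parameter.

```python
from itertools import combinations
from math import hypot

def sub_patterns(centers):
    """Displacement vector for every unordered pair of elements, in a fixed order."""
    return [(bx - ax, by - ay) for (ax, ay), (bx, by) in combinations(centers, 2)]

def pair_similarity(u, v, tolerance=50.0):
    """1.0 for identical relative positions, falling linearly to 0.0 at `tolerance`."""
    diff = hypot(u[0] - v[0], u[1] - v[1])
    return max(0.0, 1.0 - diff / tolerance)

def second_similarity(template_centers, candidate_centers, tolerance=50.0):
    """Average similarity of corresponding sub-patterns (S2)."""
    t = sub_patterns(template_centers)
    c = sub_patterns(candidate_centers)
    if len(t) != len(c) or not t:
        return 0.0
    return sum(pair_similarity(u, v, tolerance) for u, v in zip(t, c)) / len(t)

template = [(0, 0), (100, 0), (0, 50)]          # target first, then structural elements
shifted = [(10, 20), (110, 20), (10, 70)]       # same layout, translated
scrambled = [(0, 0), (0, 100), (50, 0)]         # different layout
print(second_similarity(template, shifted))
print(second_similarity(template, scrambled))
```

Using displacement vectors makes the comparison independent of where the whole group of elements sits on the screen, which matches the idea that only relative positions matter.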
Step 103, determining the distribution information of the target element on the current software interface according to the similarity operation result.
In the embodiment, the approximate primitive set with the maximum total similarity is selected as a candidate set; and if the total similarity of the candidate set is greater than a preset threshold, determining the distribution information of the target element on the current software interface according to the position relation of the interface elements in the candidate set.
Specifically, each approximate primitive set E_i is obtained through step 102. The total similarity S of each E_i is determined from its first similarity set and second similarity set, and the maximum value S_max of S is taken. If S_max is greater than the set threshold TH, the corresponding approximate primitive set is determined to best match the structural mode of the target element, and the distribution information of the target element on the current software interface is determined according to the position relation between the interface elements in that approximate primitive set.
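The selection in step 103 can be sketched as taking the maximum total similarity and comparing it with the threshold. The function name `select_best` and the pairing of each approximate primitive set with its total similarity are assumptions for illustration.

```python
def select_best(candidates, threshold):
    """candidates: list of (approximate_primitive_set, total_similarity) pairs.

    Returns the set with the largest total similarity if that similarity
    exceeds the threshold, otherwise None (no reliable match).
    """
    if not candidates:
        return None
    best_set, best_score = max(candidates, key=lambda item: item[1])
    return best_set if best_score > threshold else None

# Hypothetical total similarities of two approximate primitive sets.
candidates = [(("t00", "a00"), 1.2), (("t01", "a00"), 2.6)]
print(select_best(candidates, threshold=2.0))
print(select_best(candidates, threshold=3.0))
```

Returning None when the threshold is not exceeded lets the caller treat the target element as absent from the current interface instead of acting on a weak match.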
In this embodiment, interface elements in the current software interface are extracted; similarity calculation is performed based on the structural mode of the target element and the interface elements; and the distribution information of the target element on the current software interface is determined according to the similarity operation result. This can improve the accuracy of matching interface elements on the software interface during robotic process automation; the implementation is simple, and the effect is stable and reliable.
Further, after determining the distribution information of the target element on the current software interface, an access operation to the target element may be performed according to the distribution information.
In one possible implementation, the access operation to the target element is performed based on the distribution information.
Specifically, after the distribution information of the target element is acquired, the target element may be accessed, for example by picking up the target element and simulating an operation on it.
Fig. 3 is a flowchart illustrating a method for identifying a software interface element that combines an RPA and an AI according to another exemplary embodiment, where as shown in fig. 3, the method provided in this embodiment may include:
Step 201, acquiring a target element and a structural mode of a template software interface.
It should be noted that, before attempting to perform similarity calculation based on the structure pattern and the interface elements of the target element, all interface elements of the template software interface may be extracted as candidate elements, and the target element may be selected from the candidate elements, and the structure pattern corresponding to the target element may be obtained.
In this embodiment, an interface image of a template software interface may be intercepted; extracting all interface elements from an interface image of a template software interface as candidate elements by an Optical Character Recognition (OCR) technology or a pre-trained deep learning model; selecting a target element and a structural element associated with the target element from the candidate elements; wherein the structural elements include: any one or more of icon elements, text elements, key elements; generating a candidate mode of the target element according to the position relation between the target element and the structural element; the candidate patterns include: a connection structure diagram formed by the spatial positions of the target element and the structural element; judging whether the form of the candidate mode is unique, if not, reselecting the structural elements associated with the target elements until the form of the formed candidate mode is unique; and taking the candidate pattern with the unique form as a structural pattern corresponding to the target element.
Specifically, an interface image of the template software interface may be captured. Text elements are detected by OCR, which yields the position and character content of each piece of text in the interface; for icon and control elements, their positions and types in the interface can be detected by a deep learning object detection algorithm (such as SSD or Faster R-CNN). With all extracted interface elements taken as candidate elements, the target element to be operated on is selected; it may be a control, a text or an icon, and there is only one target element. Interface elements around the target element are then selected as structural elements; these may be controls, texts or icons, and there may be one or more of them. A candidate mode of the target element is generated according to the position relation between the target element and the structural elements; the candidate mode includes a connection structure diagram formed by the spatial positions of the target element and the structural elements. To ensure that the structural mode can be searched accurately, the candidate mode must be checked for uniqueness; if it is not unique, the structural elements associated with the target element are reselected until the candidate mode formed is unique, and the candidate mode with the unique form is taken as the structural mode corresponding to the target element. Finally, feature information of the structural mode of the target element is generated and stored in the RPA process source code; the feature information mainly comprises the category, position and text content of the target element, and the category, position and text content of each structural element.
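The feature information of a structural mode (category, position and text content of the target element and of each structural element) could, for example, be held in a small record and serialized before being stored. The sketch below is only one possible layout: the class names, the (left, top, width, height) position convention and the choice of JSON are assumptions, since the application does not specify a storage format.

```python
import json
from dataclasses import dataclass, asdict, field
from typing import List, Tuple

@dataclass
class PrimitiveInfo:
    category: str                          # e.g. "control", "text" or "icon"
    position: Tuple[int, int, int, int]    # assumed (left, top, width, height)
    text: str = ""                         # text content, empty if none

@dataclass
class StructuralModeInfo:
    target: PrimitiveInfo
    structural: List[PrimitiveInfo] = field(default_factory=list)

def serialize_mode(mode):
    """Serialize the feature information to a JSON string."""
    return json.dumps(asdict(mode), ensure_ascii=False, sort_keys=True)

# Hypothetical example: an input box located by the label to its left.
mode = StructuralModeInfo(
    target=PrimitiveInfo("control", (120, 80, 200, 24)),
    structural=[PrimitiveInfo("text", (40, 80, 70, 24), "User name")],
)
stored = serialize_mode(mode)
print(stored)
```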
Step 202, extracting interface elements in the current software interface.
Step 203, performing similarity calculation based on the structural mode of the target element and the interface element.
Step 204, determining the distribution information of the target element on the current software interface according to the similarity operation result.
Step 205, executing the access operation on the target element according to the distribution information.
In this embodiment, please refer to the related description in step 101 to step 104 in the method shown in fig. 2 for the specific implementation process and technical principle of step 202 to step 205, which is not described herein again.
In this embodiment, interface elements in the current software interface are extracted; similarity calculation is performed based on the structural mode of the target element and the interface elements; the distribution information of the target element on the current software interface is determined according to the similarity operation result; and the access operation on the target element is executed according to the distribution information. This can improve the accuracy of matching interface elements on the software interface during robotic process automation; the implementation is simple, and the effect is stable and reliable.
In addition, the embodiment can also intercept the interface image of the template software interface; extracting all interface elements from an interface image of a template software interface as candidate elements by an Optical Character Recognition (OCR) technology or a pre-trained deep learning model; selecting a target element and a structural element associated with the target element from the candidate elements; wherein the structural elements include: any one or more of icon elements, text elements, key elements; generating a candidate mode of the target element according to the position relation between the target element and the structural element; the candidate patterns include: a connection structure diagram formed by the spatial positions of the target element and the structural element; judging whether the form of the candidate mode is unique, if not, reselecting the structural elements associated with the target elements until the form of the formed candidate mode is unique; and taking the candidate pattern with the unique form as a structural pattern corresponding to the target element.
Fig. 4 is a schematic structural diagram illustrating a recognition apparatus for software interface elements that combines RPA and AI according to an example embodiment. As shown in fig. 4, the identifying device of the software interface element combining RPA and AI according to the present embodiment may include:
the extracting module 31 is configured to extract interface elements in the current software interface;
the matching module 32 is used for performing similarity calculation based on the structural mode of the target element and the interface element;
and the identification module 33 is configured to determine distribution information of the target element on the current software interface according to the similarity operation result.
In one possible design, the extraction module 31 is specifically configured to:
intercepting an interface image of a current software interface;
and extracting all interface elements from the interface image by an Optical Character Recognition (OCR) technology or a pre-trained deep learning model.
In one possible design, the structural modes include:
a primitive set consisting of target elements and structural elements; and a position relation set of the position relation between every two elements in the primitive set.
In one possible design, the matching module 32 is specifically configured to:
determining all approximate primitive sets in the current software interface according to the primitive sets;
for each approximate primitive set, obtaining a first similarity set of each approximate primitive set based on the similarity between each element in the approximate primitive set and each element in the primitive set;
determining a second similarity of each approximate primitive set based on the position relationship between every two elements in each approximate primitive set and the similarity of the position relationship between every two corresponding elements in the primitive set;
and determining the total similarity of the primitive set and each approximate primitive set based on the first similarity set and the second similarity set.
In one possible design, the matching module 32 is specifically configured to:
the primitive set is denoted as set _E,
$$
\_E = \{\, \_t_0,\ \_a_0,\ \ldots,\ \_a_n \,\}
$$
where _t0 denotes the target element, _a0 denotes the 1st structural element, and _an denotes the n-th structural element;
searching the current software interface for interface elements matching each element in the primitive set, all the found interface elements forming a set C,
$$
C = \left\{ \begin{array}{llll}
t_{00}, & t_{01}, & \ldots, & t_{0x} \\
a_{00}, & a_{01}, & \ldots, & a_{0y} \\
\vdots & & & \\
a_{n0}, & a_{n1}, & \ldots, & a_{nz}
\end{array} \right\}
$$
where t00 denotes the 1st interface element matching the target element, t01 denotes the 2nd interface element matching the target element, and t0x denotes the x-th interface element matching the target element; a00 denotes the 1st interface element matching the 1st structural element, a01 denotes the 2nd interface element matching the 1st structural element, and a0y denotes the y-th interface element matching the 1st structural element; an0 denotes the 1st interface element matching the n-th structural element, an1 denotes the 2nd interface element matching the n-th structural element, and anz denotes the z-th interface element matching the n-th structural element; x, y and n are natural numbers greater than 0;
and selecting one element from each row of elements in the set C to form an approximate primitive set.
In one possible design, the matching module 32 is specifically configured to:
respectively calculating the similarity between each element in the set C and the element at the corresponding position in the primitive set to obtain a similarity set S1,
$$
S1 = \left\{ \begin{array}{llll}
st_{00}, & st_{01}, & \ldots, & st_{0x} \\
sa_{00}, & sa_{01}, & \ldots, & sa_{0y} \\
\vdots & & & \\
sa_{n0}, & sa_{n1}, & \ldots, & sa_{nz}
\end{array} \right\}
$$
where st00 denotes the similarity between interface element t00 and target element _t0, st01 denotes the similarity between interface element t01 and target element _t0, and st0x denotes the similarity between interface element t0x and target element _t0; sa00 denotes the similarity between interface element a00 and the 1st structural element, sa01 denotes the similarity between interface element a01 and the 1st structural element, and sa0y denotes the similarity between interface element a0y and the 1st structural element; san0 denotes the similarity between interface element an0 and the n-th structural element, san1 denotes the similarity between interface element an1 and the n-th structural element, and sanz denotes the similarity between interface element anz and the n-th structural element;
and according to the similarity set S1, searching the similarity corresponding to each element of each approximate primitive set, and calculating the average value of the sum of the similarities of each element in the approximate primitive set to obtain a first similarity set of each approximate primitive set.
In one possible design, the matching module 32 is specifically configured to:
combining elements in the primitive set pairwise to form a sub-mode set _ L of the primitive set;
combining the elements in each approximate primitive set pairwise to form a sub-mode set L_i of the approximate primitive set, where i = 1, 2, ..., M and M denotes the total number of approximate primitive sets;
and calculating the similarity between each element in the sub-mode set of the approximate primitive set and each element in the sub-mode set of the primitive set to obtain a second similarity set of each approximate primitive set.
In one possible design, the matching module 32 is specifically configured to:
and taking the sum of the first similarity set and the second similarity of the approximate primitive set as the total similarity of the approximate primitive set.
In one possible design, the identification module 33 is specifically configured to:
selecting an approximate primitive set with the maximum total similarity as a candidate set;
and if the total similarity of the candidate set is greater than a preset threshold, determining the distribution information of the target element on the current software interface according to the position relation of the interface elements in the candidate set.
In one possible design, further comprising: an execution module 34 configured to:
and executing the access operation on the target element according to the distribution information.
In one possible design, the matching module 32 is further configured to:
and performing similarity matching operation on the structural mode corresponding to the target element and the structural mode corresponding to each interface element in the current software interface.
The apparatus provided in this embodiment may be used to implement the technical solution of the method embodiment shown in fig. 2, and the implementation principle and the technical effect are similar, which are not described herein again.
In this embodiment, similarity calculation is performed based on the structural mode of the target element and the interface elements; the distribution information of the target element on the current software interface is determined according to the similarity operation result; and the access operation on the target element is executed according to the distribution information. This can improve the accuracy of matching interface elements on the software interface during robotic process automation; the implementation is simple, and the effect is stable and reliable.
Based on the embodiment shown in fig. 4, fig. 5 is a schematic structural diagram of an apparatus for identifying a software interface element combining an RPA and an AI according to another exemplary embodiment, and as shown in fig. 5, the apparatus for identifying a software interface element combining an RPA and an AI according to this embodiment further includes:
the obtaining module 35 is configured to extract all interface elements of the template software interface as candidate elements;
and selecting a target element from the candidate elements, and acquiring a structural mode corresponding to the target element.
In one possible design, the obtaining module 35 is further configured to:
acquiring a structural element associated with the target element;
generating a candidate mode of the target element according to the target element, the structural element, and the position relation between the target element and the structural element;
if the position relation corresponding to the candidate mode in the template software interface is determined not to be unique, reselecting the structural element associated with the target element;
and if the position relation corresponding to the candidate mode in the template software interface is determined to be unique, taking the candidate mode as the structural mode corresponding to the target element.
In one possible design, the obtaining module 35 is further configured to:
intercepting an interface image of a template software interface;
and extracting all interface elements from the interface image of the template software interface as candidate elements by an Optical Character Recognition (OCR) technology or a pre-trained deep learning model.
In one possible design, the structural elements include: any one or more of an icon element, a text element, a key element.
The apparatus provided in this embodiment may be used to implement the technical solutions of the method embodiments shown in fig. 2 and fig. 3, and the implementation principles and technical effects are similar, which are not described herein again.
In this embodiment, similarity calculation is performed based on the structural mode of the target element and the interface elements; the distribution information of the target element on the current software interface is determined according to the similarity operation result; and the access operation on the target element is executed according to the distribution information. This can improve the accuracy of matching interface elements on the software interface during robotic process automation; the implementation is simple, and the effect is stable and reliable.
Fig. 6 is a schematic structural diagram of an electronic device shown in the present application according to an example embodiment. As shown in fig. 6, the present embodiment provides an electronic device 40, including:
a processor 401; and
a memory 402 for storing executable instructions of the processor, where the memory may also be a flash memory;
wherein the processor 401 is configured to perform the respective steps of the above-described method via execution of executable instructions. Reference may be made in particular to the description relating to the preceding method embodiment.
Alternatively, the memory 402 may be separate or integrated with the processor 401.
When the memory 402 is a device independent of the processor 401, the electronic device 40 may further include:
a bus 403 for connecting the processor 401 and the memory 402.
The present embodiment also provides a readable storage medium, in which a computer program is stored, and when at least one processor of the electronic device executes the computer program, the electronic device executes the methods provided by the above various embodiments.
The present embodiment also provides a program product comprising a computer program stored in a readable storage medium. The computer program can be read from a readable storage medium by at least one processor of the electronic device, and the execution of the computer program by the at least one processor causes the electronic device to implement the methods provided by the various embodiments described above.
Finally, it should be noted that: the above embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present application.

Claims (17)

1. A method for identifying software interface elements in combination with RPA and AI, comprising:
extracting interface elements in a current software interface;
performing similarity operation based on the structural mode of the target element and the interface element;
and determining the distribution information of the target element on the current software interface according to the similarity operation result.
2. The method of claim 1, wherein extracting interface elements in the current software interface comprises:
intercepting an interface image of a current software interface;
and extracting all interface elements from the interface image by an Optical Character Recognition (OCR) technology or a pre-trained deep learning model.
3. The method of claim 1, wherein the structural patterns comprise: a primitive set consisting of target elements and structural elements; and a position relation set of the position relation between every two elements in the primitive set.
4. The method of claim 3, wherein performing similarity operations based on the structural schema of the target element and the interface element comprises:
determining all approximate primitive sets in the current software interface according to the primitive sets;
for each approximate primitive set, obtaining a first similarity set of each approximate primitive set based on the similarity between each element in the approximate primitive set and each element in the primitive set;
determining a second similarity of each approximate primitive set based on the position relationship between every two elements in each approximate primitive set and the similarity of the position relationship between every two corresponding elements in the primitive set;
and determining the total similarity of the primitive set and each approximate primitive set based on the first similarity set and the second similarity.
5. The method of claim 4, wherein determining from the primitive set all approximate primitive sets in a current software interface comprises:
searching interface elements matched with target elements in the primitive set to form a first interface element set corresponding to the target elements;
respectively searching interface elements matched with all structural elements in the primitive set to form a second interface element set corresponding to all the structural elements; wherein each structural element in the primitive set corresponds to an independent second interface element set;
and respectively selecting any interface element from the first interface element set and each second interface element set to form the approximate primitive set.
6. The method of claim 4, wherein the obtaining, for each approximate primitive set, a first similarity set for each approximate primitive set based on a similarity of each element to each element in the primitive set comprises:
and obtaining the similarity between a first interface element in the approximate primitive set and a target element in the primitive set, and the similarity between each second interface element in the approximate primitive set and each corresponding structural element in the primitive set, so as to obtain the first similarity set of the approximate primitive set.
7. The method of claim 4, wherein determining the second similarity of each approximate primitive set based on the similarity of the position relationship between two elements in each approximate primitive set and the position relationship between corresponding two elements in the primitive set comprises:
combining elements in the primitive set pairwise to form a sub-mode set of the primitive set;
combining elements in each approximate primitive set pairwise to form a sub-mode set of the approximate primitive set;
and calculating the similarity between each element in the sub-mode set of the approximate primitive set and each element in the sub-mode set of the primitive set to obtain a second similarity set of each approximate primitive set.
8. The method according to claim 4, wherein the determining, according to the result of the similarity operation, distribution information of the target element on the current software interface includes:
selecting an approximate primitive set with the maximum total similarity as a candidate set;
and if the total similarity of the candidate set is greater than a preset threshold, determining the distribution information of the target element on the current software interface according to the position relation of the interface elements in the candidate set.
9. The method according to any one of claims 1 to 8, wherein after determining the distribution information of the target element on the current software interface according to the similarity operation result, the method further comprises:
and executing the access operation on the target element according to the distribution information.
10. The method according to any one of claims 1-8, wherein performing similarity operations based on the structural schema of the target element and the interface element further comprises:
and performing similarity matching operation on the structural mode corresponding to the target element and the structural mode corresponding to each interface element in the current software interface.
11. The method according to any one of claims 1-8, wherein before performing the similarity operation based on the structural model of the target element and the interface element, further comprising:
extracting all interface elements of the template software interface as candidate elements;
and selecting a target element from the candidate elements, and acquiring a structural mode corresponding to the target element.
12. The method of claim 11, wherein the selecting a target element from the candidate elements and obtaining a structural mode corresponding to the target element comprises:
acquiring a structural element associated with the target element;
generating a candidate mode of the target element according to the target element, the structural element, and the position relation between the target element and the structural element;
if the position relation corresponding to the candidate mode in the template software interface is determined not to be unique, reselecting the structural element associated with the target element;
and if the position relation corresponding to the candidate mode in the template software interface is determined to be unique, taking the candidate mode as the structural mode corresponding to the target element.
13. The method of claim 11, wherein extracting all interface elements of the template software interface as candidate elements comprises:
intercepting an interface image of a template software interface;
and extracting all interface elements from the interface image of the template software interface as candidate elements by an Optical Character Recognition (OCR) technology or a pre-trained deep learning model.
14. The method of claim 9, wherein the structural elements comprise: any one or more of an icon element, a text element, a key element.
15. An apparatus for identifying software interface elements in combination with RPA and AI, comprising:
the extraction module is used for extracting interface elements in the current software interface;
the matching module is used for carrying out similarity operation based on the structural mode of the target element and the interface element;
and the identification module is used for determining the distribution information of the target element on the current software interface according to the similarity operation result.
16. An electronic device, comprising:
a processor; and
a memory for storing executable instructions of the processor;
wherein the processor is configured to perform, by executing the executable instructions, the software interface element identification method combining RPA and AI of any one of claims 1 to 14.
17. A storage medium storing a computer program which, when executed by a processor, implements the software interface element identification method combining RPA and AI of any one of claims 1 to 14.
CN202011126611.2A 2019-12-23 2020-10-20 Software interface element identification method and device combining RPA and AI Pending CN112231034A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN2019113390126 2019-12-23
CN201911339012 2019-12-23

Publications (1)

Publication Number Publication Date
CN112231034A (en) 2021-01-15

Family

ID=74117545

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011126611.2A Pending CN112231034A (en) 2019-12-23 2020-10-20 Software interface element identification method and device combining RPA and AI

Country Status (1)

Country Link
CN (1) CN112231034A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114360038A (en) * 2022-03-22 2022-04-15 杭州实在智能科技有限公司 Weak supervision RPA element identification method and system based on deep learning
CN115495055A (en) * 2022-11-03 2022-12-20 杭州实在智能科技有限公司 RPA element matching method and system based on interface region identification technology
CN116051868A (en) * 2023-03-31 2023-05-02 山东大学 Interface element identification method for windows system
CN116627807A (en) * 2023-05-12 2023-08-22 南京数睿数据科技有限公司 Mobile application test repair method integrating interface element semantics and structural information
WO2024066067A1 (en) * 2022-09-30 2024-04-04 北京弘玑信息技术有限公司 Method for positioning target element on interface, medium, and electronic device
CN118642810A (en) * 2024-08-14 2024-09-13 深圳市客一客信息科技有限公司 Intelligent RPA interaction method, device and system based on multi-mode visual retrieval

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180204360A1 (en) * 2017-01-13 2018-07-19 International Business Machines Corporation Automatic data extraction from a digital image
CN108363599A (en) * 2018-01-12 2018-08-03 深圳壹账通智能科技有限公司 User interface shows recognition methods and terminal device
CN110046085A (en) * 2018-12-03 2019-07-23 阿里巴巴集团控股有限公司 The method and device of the application program control shown on identification terminal equipment

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Li Guohui et al.: "Structure matching method based on functional dependency", Journal of Software (软件学报), vol. 20, no. 10, 15 October 2009 (2009-10-15), pages 2667 - 2678 *

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114360038A (en) * 2022-03-22 2022-04-15 杭州实在智能科技有限公司 Weak supervision RPA element identification method and system based on deep learning
WO2024066067A1 (en) * 2022-09-30 2024-04-04 北京弘玑信息技术有限公司 Method for positioning target element on interface, medium, and electronic device
CN115495055A (en) * 2022-11-03 2022-12-20 杭州实在智能科技有限公司 RPA element matching method and system based on interface region identification technology
CN115495055B (en) * 2022-11-03 2023-09-08 杭州实在智能科技有限公司 RPA element matching method and system based on interface region identification technology
CN116051868A (en) * 2023-03-31 2023-05-02 山东大学 Interface element identification method for windows system
WO2024198177A1 (en) * 2023-03-31 2024-10-03 山东大学 Interface element identification method for windows system
CN116627807A (en) * 2023-05-12 2023-08-22 南京数睿数据科技有限公司 Mobile application test repair method integrating interface element semantics and structural information
CN116627807B (en) * 2023-05-12 2024-04-09 南京数睿数据科技有限公司 Mobile application test repair method integrating interface element semantics and structural information
CN118642810A (en) * 2024-08-14 2024-09-13 深圳市客一客信息科技有限公司 Intelligent RPA interaction method, device and system based on multi-mode visual retrieval

Similar Documents

Publication Publication Date Title
CN112231034A (en) Software interface element identification method and device combining RPA and AI
CN112231033A (en) Software interface element matching method and device combining RPA and AI
US11279040B2 (en) Robot process automation apparatus and method for detecting changes thereof
US11126789B2 (en) Method to convert a written procedure to structured data, and related systems and methods
US20160098615A1 (en) Apparatus and method for producing image processing filter
CN113742205B (en) Code vulnerability intelligent detection method based on man-machine cooperation
CN112749758A (en) Image processing method, neural network training method, device, equipment and medium
CN110209929B (en) Resume recommendation method and device, computer equipment and storage medium
CN109871891B (en) Object identification method and device and storage medium
CN111611395B (en) Entity relationship identification method and device
CN112465144A (en) Multi-modal demonstration intention generation method and device based on limited knowledge
Ben-Yosef et al. Full interpretation of minimal images
CN115268719B (en) Method, medium and electronic device for positioning target element on interface
CN114022684B (en) Human body posture estimation method and device
US12112513B2 (en) System and method for identifying non-standard user interface object
CN118519661B (en) Application program updating method and related device
CN115035367A (en) Picture identification method and device and electronic equipment
US20200394460A1 (en) Image analysis device, image analysis method, and image analysis program
CN116805522A (en) Diagnostic report output method, device, terminal and storage medium
Zrira et al. Evaluation of PCL's Descriptors for 3D Object Recognition in Cluttered Scene
CN113688243A (en) Method, device and equipment for marking entities in sentences and storage medium
US20210326754A1 (en) Storage medium, learning method, and information processing apparatus
JP7403340B2 (en) A system that determines whether an object recognition model can be used.
CN112685056A (en) Script updating method and device
KR102619275B1 (en) Object search model and learning method thereof

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Country or region after: China

Address after: 1902, 19th Floor, China Electronics Building, No. 3 Danling Road, Haidian District, Beijing

Applicant after: BEIJING LAIYE NETWORK TECHNOLOGY Co.,Ltd.

Applicant after: Laiye Technology (Beijing) Co.,Ltd.

Address before: 1902, 19 / F, China Electronics Building, 3 Danling Road, Haidian District, Beijing 100080

Applicant before: BEIJING LAIYE NETWORK TECHNOLOGY Co.,Ltd.

Country or region before: China

Applicant before: BEIJING BENYING NETWORK TECHNOLOGY Co.,Ltd.
