KR102177550B1

KR102177550B1 - Method of automatically recognizing and classifying information of design in imaged PID drawings

Info

Publication number: KR102177550B1
Application number: KR1020180152246A
Authority: KR
Inventors: 강성오; 백흠경; 이을범
Original assignee: 도프텍(주); 포항공과대학교 산학협력단
Priority date: 2018-11-30
Filing date: 2018-11-30
Publication date: 2020-11-11
Also published as: KR20200065613A

Abstract

The main object of the present invention is to provide a method that automatically digitizes an imaged P&ID drawing and recognizes and classifies its design information, so that design elements can be tallied and design information digitized quickly and with high accuracy.
To achieve this object, a method of automatically recognizing and classifying design information in an imaged P&ID drawing comprises: extracting a symbol area from the imaged P&ID drawing, setting the origin and connection points of the corresponding symbol in the symbol area, and automatically registering the symbol in a database; recognizing and extracting the pre-registered symbols from the imaged P&ID drawing and removing the extracted symbols from the imaged P&ID drawing; recognizing and extracting lines from the symbol-removed imaged P&ID drawing using a sliding window method that computes in blob units rather than pixel units; calculating an aspect ratio in the symbol-removed imaged P&ID drawing to determine the areas where text exists, and then recognizing and extracting the text in those areas; classifying the text detected in the Drawing area, among the extracted text, into attributes using a predefined attribute classification scheme; and associating the extracted symbols and lines with the attributes of the extracted text on a nearest-distance basis and, when an extracted symbol is equipment, associating it on the basis of the equipment name recognized in the text.

Description

{Method of automatically recognizing and classifying information of design in imaged PID drawings}

The present invention relates to a method of automatically recognizing and storing design information in an image P&ID and, more specifically, to a method of recognizing the symbols, lines, and text of an image P&ID drawing and then classifying, associating, and storing the recognized design information.

Engineering companies, oil refiners, chemical companies, and the like typically hold design drawings in CAD formats such as AutoCAD and in PDF or hard-copy form. With AI and big data technologies spreading in the course of the recent Fourth Industrial Revolution, digitizing the data held in PDF or hard-copy form is essential for applying these technologies across the shipbuilding and plant engineering industries.

In the conventional digitization process at a contractor, when a P&ID drawing that was designed by the client and converted to an image file such as PNG, JPG, or PDF (hereinafter an "imaged P&ID") is received by the contractor, a design engineer at the contractor redraws the imaged P&ID drawing as a new P&ID drawing.

Here, P&ID is an abbreviation of “Piping & Instrument Drawing” and refers to a process flow diagram that presents the equipment, piping, electrical instrumentation, and the like of a process at a glance in diagram form. Design programs for creating P&IDs include SP P&ID (SmartPlant P&ID) from Intergraph, Aveva P&ID from Aveva, and AutoCAD Plant P&ID from AutoDesk.

Materials and quantities are recorded in the P&ID drawing prepared by the contractor. Based on this drawing, an engineer tallies the quantity data with a material take-off program running on a computer, generates a bill of materials (BOM), and calculates an estimate from the bill of materials.

However, the conventional digitization process described above was used for little more than checking for errors by visually comparing the digitized design drawing with the imaged drawing, which posed a considerable obstacle to using the data that is both the most important information and necessary for design automation. That is, the imaged P&ID drawing and the newly created drawings must be compared one by one to check whether any material listed on the imaged P&ID drawing was omitted from the new drawing or whether any item was left out of the take-off, which makes checking the mutual consistency of design products difficult.

In addition, because every material item, including valves, must be entered when a new drawing is created, creating the new drawing and entering the quantities takes an unnecessarily long time, and when the piping layout or the like changes, revising the calculated materials and managing the updates is very difficult.

Because of these problems, most companies in practice cannot check take-off results thoroughly, so the accuracy and reliability of material take-off are markedly low, and errors in the calculated material quantities cause many problems, such as construction delays.

Given the conventional method described above, demand for technology that supports digitizing existing drawing data in image form is growing and its impact on related industries is high, but development of automatic drawing recognition technology has been slow because it is difficult to generate revenue in the short term.

The present invention was developed to solve these conventional problems. Its main object is, in digitizing equipment and symbols and calculating estimates during the FEED (Front-End Engineering Design) stage of design, to automatically recognize and extract design information from an imaged P&ID drawing so that it can be digitized more accurately and quickly, and to allow the automatically recognized design information to be used effectively in creating P&ID design drawings.

To achieve the above object, the present invention provides a method of automatically recognizing and classifying design information in an imaged P&ID drawing, comprising: removing lines and text from the imaged P&ID drawing, extracting a symbol area, and automatically registering in a database the symbol in that symbol area together with the symbol's origin and connection points; recognizing and extracting the pre-registered symbols from the imaged P&ID drawing in four directions and removing the extracted symbols from the imaged P&ID drawing; removing trim lines from the symbol-removed imaged P&ID drawing and recognizing and extracting lines using a sliding window method; calculating an aspect ratio in the symbol-removed imaged P&ID drawing to determine the areas where text exists, and then recognizing and extracting the text in those areas by OCR; classifying the text detected in the Drawing area, among the extracted text, into attributes using a predefined attribute classification scheme; and associating the attribute values of the extracted symbols and lines with the classified attribute in the text that is nearest to the corresponding symbol or line and, for equipment among the extracted symbols, associating on the basis of the equipment name recognized in the text.

In addition, following the step of associating the extracted design information, the method may further include generating a topology by rearranging the symbols in the order of the flow marks of the lines.

In addition, the extracted design information and topology may be generated as an intermediate file in an interoperable XML format.

In addition, the recognition rate may be increased by checking symbols registered with additional symbols first, checking equipment first among the symbols, and finding and extracting nozzles around the equipment area.

In addition, in the step of recognizing the symbols, the feature points of a recognized symbol may be compared with those of a stored symbol, and the recognized symbol may be accepted as the stored symbol only when its degree of match is higher than a set threshold.

In addition, in the step of recognizing and extracting the lines, when a connection point of an extracted symbol and a line, or two lines, are connected but their coordinates differ, the lines may be extracted after correcting the coordinates in pixel units.

In addition, when text is not recognized by the text recognition method, the recognition rate may be increased by storing the corresponding text image, mapping characters to the stored image, and training the OCR.

If drawings are digitized automatically by the above method, design elements can be tallied quickly and with high accuracy, so that most tasks, such as drawing generation, material take-off, and production of the equipment list, line list, and instrument list that constitute the basic design information, can be performed automatically. This eliminates the simple, repetitive work of senior engineers calculating design elements by hand and helps improve productivity.

In addition, if drawings are generated automatically from the data produced by the above method, rather than drawn by hand from the imaged P&ID as in the conventional method, consistency among design products is maintained and design quality can be improved. This resolves the wasted time, omitted items, and clerical errors that plant engineering companies incurred when drawings were drawn while checking each item by eye.

In addition, when accuracy is verified after 3D modeling, the model can be compared with the stored data, so the accuracy of the created drawing can be verified quickly and accurately.

FIG. 1 is a diagram showing an embodiment of an imaged P&ID drawing of the present invention.
FIG. 2 is a flowchart illustrating the classification method according to the present invention.
FIG. 3 is an exemplary view showing the step of setting an origin and connection points in symbol registration of the present invention.
FIG. 4 is an exemplary view showing the step of recognizing symbols among the design information of the present invention.
FIG. 5 is a diagram showing an embodiment after the symbols have been removed from the imaged P&ID drawing of the present invention.
FIG. 6 is an exemplary view showing the step of recognizing lines among the design information of the present invention.
FIG. 7 is an exemplary view showing a text area extracted in the step of recognizing text among the design information of the present invention.
FIG. 8 is an exemplary view showing the step of associating the extracted symbols and lines with design information of the present invention.
FIG. 9 is an exemplary view showing the step of rearranging in flow mark order of the present invention.
FIG. 10 is a diagram showing the hierarchy structure of the database generated as an intermediate file of the present invention.

Hereinafter, the method of automatically recognizing and classifying design information in an imaged P&ID drawing according to the present invention is described in more detail with reference to the accompanying drawings.

However, the present invention is not limited to the embodiments disclosed below and may be implemented in various different forms. These embodiments are provided only so that the disclosure of the present invention is complete and so that those of ordinary skill in the art to which the invention pertains are fully informed of the scope of the invention, which is defined only by the scope of the claims. In describing the present invention, detailed descriptions of related known functions or configurations are omitted where they could unnecessarily obscure the gist of the invention.

FIG. 1 is an embodiment of an imaged P&ID drawing that contains symbols, lines, and text. A symbol is a schematic representation of a material in the Drawing area, excluding lines and text, and symbols comprise equipment, instruments, fittings, OPCs (Operation page connection), and the like. FIG. 1 consists of symbols such as a valve (100), an instrument (110), equipment (120), a nozzle (130) attached to the equipment, and an OPC (140). The OPC is a symbol indicating another drawing connected to this drawing; it carries a direction mark showing the direction of the process, and the number of the connected P&ID drawing is written inside it. The lines are the straight segments connected to the symbols in FIG. 1 and consist of process lines and utility lines. A process line is a piping line in which the main operation of the plant takes place, and a utility line is a line that supports the operation of the process lines, such as an electrical signal or control line. The text describes the symbols and lines and includes text (150) describing equipment and a Line Number (160) describing a line.

As described above, design engineers conventionally redrew each P&ID drawing supplied as an image file, typically a PDF file, as a new P&ID drawing. As a result, data was inconsistent between design products and unnecessary time was spent on early setup of FEED (Front-End Engineering Design) data, which caused serious problems particularly in carrying out overseas projects.

To solve these problems, the present invention provides the following automatic classification method, which tallies design elements quickly and accurately in the early FEED stage for use in estimating, improves productivity by eliminating simple repetitive work for senior engineers, and improves design quality.

FIG. 2 is a flowchart of the method of automatically recognizing and classifying design information in an imaged P&ID drawing according to the present invention.

First, the symbols to be recognized are registered from the expected symbol areas (S200). The expected symbol areas are extracted automatically from the whole drawing by a Contour algorithm. From the automatically extracted symbol areas, a symbol list is prepared according to a predefined classification scheme, and the symbols are registered in the database on that basis.

The Contour algorithm that extracts the expected symbol areas works by connecting the boundary of regions that share the same color or color intensity; in the present invention it separates the margins of the drawing from the symbol areas to extract the expected symbol areas. Registering one by one each symbol the user wants recognized in a drawing is time-consuming and inefficient, and the expected symbol areas are extracted to reduce that time.
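
The extraction of expected symbol areas can be illustrated with a minimal sketch. It assumes the drawing has already been binarized into rows of 0 (margin) and 1 (ink), and it substitutes a plain 4-connected flood fill for the Contour algorithm named above; the names candidate_regions and min_area are hypothetical and do not come from the patent.

```python
from collections import deque

def candidate_regions(grid, min_area=2):
    """Return bounding boxes (x0, y0, x1, y1) of connected ink regions."""
    h, w = len(grid), len(grid[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if grid[y][x] == 0 or seen[y][x]:
                continue
            # Flood-fill one connected region and track its extent.
            queue = deque([(x, y)])
            seen[y][x] = True
            x0 = x1 = x
            y0 = y1 = y
            area = 0
            while queue:
                cx, cy = queue.popleft()
                area += 1
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    if 0 <= nx < w and 0 <= ny < h and grid[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            # Discard specks smaller than min_area (hypothetical noise filter).
            if area >= min_area:
                boxes.append((x0, y0, x1, y1))
    return boxes

drawing = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
regions = candidate_regions(drawing)
```

Each returned box is a candidate that a user could confirm and name, rather than outlining every symbol by hand.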

FIG. 3 is an exemplary view showing connection information, such as the symbol's origin and connection points, being set on a symbol when it is registered in the database. The symbol in the example is a valve symbol: the red dot (300) at the center is the origin and the blue dots (310) on both sides are the connection points. Connection information is set on a symbol because the coordinates of a recognized symbol's connection points can serve as starting points when lines are recognized, and because this connection information can later be used to create P&ID design drawings automatically.
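
One possible shape for a registered symbol record is sketched below. The class SymbolTemplate, its fields, and the pixel values are assumptions made for illustration; the patent does not specify the database schema. The sketch shows how connection points stored relative to the template become line-search starting points once the symbol's origin has been located in a drawing.

```python
from dataclasses import dataclass, field

@dataclass
class SymbolTemplate:
    # Hypothetical record layout; coordinates are pixels inside the template image.
    name: str
    width: int
    height: int
    origin: tuple
    connection_points: list = field(default_factory=list)

    def connection_points_at(self, found_origin):
        """Translate the connection points to drawing coordinates,
        given where the origin was found in the drawing."""
        dx = found_origin[0] - self.origin[0]
        dy = found_origin[1] - self.origin[1]
        return [(x + dx, y + dy) for x, y in self.connection_points]

valve = SymbolTemplate("gate_valve", 20, 10, origin=(10, 5),
                       connection_points=[(0, 5), (20, 5)])
starts = valve.connection_points_at((110, 45))
```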

In the step of registering symbols, when a symbol forms a set with a plurality of additional symbols, the symbol may be registered as a single symbol that includes the additional symbols of the set. Rather than registering small parts as separate symbols, the additional symbols forming the set are registered together as one symbol, because this makes the symbol's features more distinctive and raises the recognition rate in the subsequent symbol recognition step.

Next, symbols are recognized and extracted from the imaged P&ID drawing on the basis of the database in which the symbols are stored (S210). Recognized symbols are removed from the imaged P&ID drawing. This shortens the time needed to recognize subsequent symbols, lowers the misrecognition rate, and prevents symbols from being mistaken for lines in the subsequent line recognition step.

Symbols may be recognized by rotating the drawing to 0, 90, 180, and 270 degrees and searching the drawing for one symbol at a time. Recognizing symbols in four directions raises the recognition rate, because symbols that are not themselves registered in the database but differ from a registered symbol only in orientation can also be recognized. In addition, because recognized symbols are removed, searching for symbols one at a time does not take much time.
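
A minimal sketch of four-direction recognition with removal follows. It makes two simplifying assumptions: the template is rotated instead of the drawing (equivalent for finding a match), and a match means the window equals the template exactly, which a real drawing would not satisfy. The function names are hypothetical.

```python
def rotate90(template):
    """Rotate a 2-D list of 0/1 values clockwise by 90 degrees."""
    return [list(row) for row in zip(*template[::-1])]

def find_in_four_directions(drawing, template):
    """Return (template_angle, x, y) of the first exact match and erase
    the matched ink from the drawing; return None if nothing matches."""
    rotated = template
    for angle in (0, 90, 180, 270):
        th, tw = len(rotated), len(rotated[0])
        for y in range(len(drawing) - th + 1):
            for x in range(len(drawing[0]) - tw + 1):
                window = [row[x:x + tw] for row in drawing[y:y + th]]
                if window == rotated:
                    # Remove the recognized symbol so later searches skip it.
                    for dy in range(th):
                        for dx in range(tw):
                            if rotated[dy][dx]:
                                drawing[y + dy][x + dx] = 0
                    return angle, x, y
        rotated = rotate90(rotated)
    return None

template = [[1, 0],
            [1, 0],
            [1, 1]]
drawing = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
result = find_in_four_directions(drawing, template)
```

Only the upright template is registered here, yet the rotated instance in the drawing is still found and erased.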

In the step of recognizing symbols, the feature points of a recognized symbol may be compared with those of a stored symbol so that the symbol is accepted as the stored symbol only when its degree of match is higher than a threshold set by the user (320). In this way the user can set the threshold freely to trade off recognition speed against recognition rate.
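
The thresholded acceptance can be sketched as follows. The patent compares feature points; this sketch swaps in a simpler per-pixel agreement score as a stand-in, and match_score, accept, and the default threshold are hypothetical.

```python
def match_score(window, template):
    """Fraction of cells on which window and template agree (1.0 = identical)."""
    total = agree = 0
    for wrow, trow in zip(window, template):
        for w, t in zip(wrow, trow):
            total += 1
            agree += (w == t)
    return agree / total

def accept(window, template, threshold=0.9):
    """Accept the window as the stored symbol only above the user threshold."""
    return match_score(window, template) > threshold

template = [[1, 1, 1, 1, 1],
            [1, 0, 0, 0, 1]]
window = [[1, 1, 1, 1, 1],
          [1, 0, 1, 0, 1]]   # one noisy pixel out of ten
score = match_score(window, template)
```

With one differing cell out of ten, the window passes a threshold of 0.85 and fails a threshold of 0.95, which is the trade-off the user controls.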

In the symbol recognition step, symbols registered with additional symbols are checked first, recognized, and removed from the drawing. Recognizing symbols that include additional symbols first reduces duplicate recognition and raises the recognition rate. It also prevents the misrecognition that could occur if the basic symbol were recognized first.

In addition, in the step of recognizing symbols, the image map of a symbol may be enlarged or reduced for comparison, depending on the size of the symbol and the complexity of the drawing, to recognize and extract the symbol.

In the step of recognizing and extracting symbols, equipment may be checked first among the symbols, and nozzles may then be found and extracted around the equipment area. Nozzles are generally located around equipment, so they are searched for around the equipment area. Because nozzle symbols are small, the probability of misrecognition is high when they are searched for over the whole Drawing area; if equipment is recognized and removed first and nozzles are then found and extracted around the equipment, the nozzle recognition rate rises and the recognition rate for symbols overall improves.
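
Restricting the nozzle search to the neighborhood of recognized equipment might look like the sketch below. The margin value, the use of candidate center points, and the function names are assumptions for illustration only.

```python
def expand(box, margin, width, height):
    """Grow a box (x0, y0, x1, y1) by margin pixels, clipped to the drawing."""
    x0, y0, x1, y1 = box
    return (max(0, x0 - margin), max(0, y0 - margin),
            min(width - 1, x1 + margin), min(height - 1, y1 + margin))

def nozzles_near_equipment(equipment_boxes, nozzle_candidates, margin, width, height):
    """Keep only nozzle candidates whose center lies inside an expanded equipment box."""
    kept = []
    for cx, cy in nozzle_candidates:
        for box in equipment_boxes:
            x0, y0, x1, y1 = expand(box, margin, width, height)
            if x0 <= cx <= x1 and y0 <= cy <= y1:
                kept.append((cx, cy))
                break
    return kept

equipment = [(100, 100, 200, 300)]
candidates = [(95, 150), (400, 400), (205, 310)]
nozzles = nozzles_near_equipment(equipment, candidates, margin=10, width=1000, height=1000)
```

The candidate far from any equipment is dropped, which is how a small, easily confused symbol avoids being matched across the whole Drawing area.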

FIG. 4 is an exemplary view of an early stage of symbol recognition, in which symbols are compared one by one against a P&ID imaged as a file with the png extension. Thirteen PSD symbols, 28 DCS INSTRUMENT symbols, and 64 LOCAL MOUNT INSTRUMENT symbols have been recognized and extracted, and the Insertion Blind Open symbol (400) shown on the right is currently being searched for.

FIG. 5 shows an embodiment after the symbols have been removed from the imaged P&ID drawing; compared with FIG. 2, the symbols can be seen to have been removed. As described above, this speeds up symbol recognition and prevents symbols from being mistaken for lines during line recognition.

Next, the lines connected to the connection points of the removed symbols in the imaged P&ID drawing are recognized and extracted using a sliding window method (S220). Before line recognition, small objects such as trim lines are removed from the imaged drawing.

FIG. 6 is an exemplary view showing lines being recognized by the sliding window method. The sliding window method recognizes lines by computing in blob units rather than pixel units, which takes less time than pixel-by-pixel recognition. Starting from a connection point of a recognized symbol (600), the sliding window (610) is moved up/down and left/right to recognize a line; if no line is found while moving left or right, the search continues up or down from the end point. Because a line is recognized whenever at least a certain portion of the sliding window is occupied, a line can be recognized even when its pixels are separated. The user can adjust the length of the sliding window freely and thereby tune recognition accuracy and speed.

When line coordinates are extracted, the coordinates of the symbol connection point where a line meets a symbol and the coordinates of the line's end point may not coincide exactly, because of the thickness of the symbol and the line in the image. In this case, a line and a symbol that are a few pixels apart are fine-adjusted to the coordinates of the symbol's connection point. This is necessary so that lines and symbols are connected exactly when a new P&ID is generated from the database of the present invention. The same fine adjustment is applied between lines, so that horizontal and vertical lines whose centers do not meet because of their own thickness are processed to be connected.
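
The fine adjustment can be sketched as snapping a line end point to the nearest connection point within a small tolerance. The tolerance of 3 pixels and the name snap_endpoint are assumptions; the patent only states that the correction is made in pixel units.

```python
import math

def snap_endpoint(endpoint, connection_points, tolerance=3):
    """Return the nearest connection point if it lies within tolerance pixels
    of the end point; otherwise return the end point unchanged."""
    best = min(connection_points, key=lambda p: math.dist(p, endpoint), default=None)
    if best is not None and math.dist(best, endpoint) <= tolerance:
        return best
    return endpoint

connection_points = [(100, 45), (120, 45)]
snapped = snap_endpoint((101, 46), connection_points)
unchanged = snap_endpoint((110, 60), connection_points)
```

The same function could be applied between two lines by passing the end points of one line as the connection_points of the other.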

Next, an aspect ratio is calculated in the symbol-removed imaged P&ID drawing to determine the areas where text exists, and the text in those areas is recognized and extracted (S230). An existing text recognition program such as OCR (Optical Character Reader) may be used to recognize the text. Because the principles of OCR are well known, a detailed description is omitted. Known text recognition methods other than OCR may of course be used. Because lines, symbols, and text are intermixed in a P&ID drawing, the text recognition rate is lower than for an ordinary document that contains only text. A method is therefore needed that calculates the character aspect ratio in the drawing, extracts the areas where text exists, and recognizes only the text in those areas.

FIG. 7 is an exemplary view of the method of calculating the text areas. First, outlines are extracted and lines and instrument bubbles are removed. Then, if the Bounding Box of a recognized part falls outside the preset aspect ratio, that part is removed, and if it falls within the preset aspect ratio range, it is kept as a text area. The part recognized as a text area is dilated up to a preset threshold, and if the next recognized part is also judged to be a text area, it is kept; in this way a Contour Bounding Box of the whole text area is generated and the whole text area is extracted. Text areas are set and extracted so that each text area can be recognized as a unit of design information and used in the subsequent step of classifying and associating the attribute information extracted from the text.
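
The aspect-ratio filter and the grouping of neighboring characters can be sketched on bounding boxes alone. The ratio limits and the gap value are hypothetical, and the morphological dilation described above is replaced here by a simpler rule that merges boxes on the same text line when they lie within gap pixels of each other.

```python
def is_character_box(box, min_ratio=0.2, max_ratio=1.5):
    """True if the width/height ratio of (x0, y0, x1, y1) looks like a character."""
    x0, y0, x1, y1 = box
    w, h = x1 - x0 + 1, y1 - y0 + 1
    return min_ratio <= w / h <= max_ratio

def merge_text_boxes(boxes, gap=5):
    """Drop boxes outside the aspect-ratio range, then merge horizontally
    adjacent boxes that overlap vertically into text regions."""
    chars = sorted(b for b in boxes if is_character_box(b))
    regions = []
    for box in chars:
        if regions:
            rx0, ry0, rx1, ry1 = regions[-1]
            overlaps_vertically = box[1] <= ry1 and box[3] >= ry0
            if overlaps_vertically and box[0] - rx1 <= gap:
                regions[-1] = (rx0, min(ry0, box[1]), max(rx1, box[2]), max(ry1, box[3]))
                continue
        regions.append(box)
    return regions

boxes = [
    (10, 10, 16, 20), (19, 10, 25, 20), (28, 10, 34, 20),  # three adjacent characters
    (0, 40, 200, 41),                                       # a long thin line
    (100, 10, 106, 20),                                     # a distant character
]
text_regions = merge_text_boxes(boxes)
```

Each resulting region could then be cropped and passed to OCR as one unit of design information.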

Once the areas where text exists have been extracted in this way, OCR is applied to the image of each extracted area to recognize the text. However, even the best current OCR does not reach a 100% recognition rate, so misrecognition and non-recognition occur, and the recognition rate needs to be raised by training on the text. When text is not recognized by the above method, the corresponding text image is first stored and the characters in each image are mapped. The characters can be mapped either by assigning the character most similar to the one in the image or by having the user designate each character manually. Training data is then generated from the mapping data, stored in a database, and applied to text recognition.

Next, among the extracted text, the text detected in the Drawing area is classified into attributes using a predefined attribute classification system (S240). The Drawing area is the part of the drawing made up of symbols and lines, as opposed to the descriptive parts; the areas are distinguished because the attributes to be classified may differ between them. For text outside the Drawing area, in the Note, Revision Data, Title Block, or Description areas, the area of each element to be recognized is set first, and detected text is assigned to a recognition element by checking whether it falls within that area.
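The containment check for the non-Drawing areas can be sketched as below. The rectangle coordinates and the helper names `contains` and `classify_region` are assumptions made for illustration; the patent only states that each area is set in advance and that detected text is tested for inclusion.

```python
from typing import Dict, Tuple

Rect = Tuple[int, int, int, int]  # (x, y, width, height)


def contains(area: Rect, box: Rect) -> bool:
    """True if `box` lies entirely inside `area`."""
    ax, ay, aw, ah = area
    bx, by, bw, bh = box
    return ax <= bx and ay <= by and bx + bw <= ax + aw and by + bh <= ay + ah


def classify_region(box: Rect, areas: Dict[str, Rect], default: str = "Drawing") -> str:
    """Return the name of the preset area containing the text box, else the default."""
    for name, area in areas.items():
        if contains(area, box):
            return name
    return default
```

For example, with a hypothetical Title Block area at `(800, 500, 200, 100)`, a text box at `(820, 520, 60, 12)` is assigned to the Title Block, and a box elsewhere on the sheet falls back to the Drawing area.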

The attributes of text detected in the Drawing area are divided into Line Number, Size, Tag Number, Instrument Type, Serial Number, P&ID Name, and so on. These attributes can be defined freely by the user. The Line Number follows the format specified by the project owner and is composed of fields such as Size, Fluid, Serial Number, and Insulation. Size text consists of combinations of digits and special characters (/, "); a Tag Number consists of combinations of letters, digits, and special characters (-, /, "); and the Instrument Type is specified in the project.

FIG. 7 shows an embodiment in which a line number is extracted and its attribute information is classified into Fluid (700), Unit (710), Sequence (720), Material (730), Size (740), Insulation (750), and so on.
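Splitting a line number into these fields can be sketched with a regular expression. The field order and the hyphen separator below are assumptions inferred from line numbers of the form `P-234-03303-CD3D-2"-N` that appear elsewhere in this description; since the format follows the project owner's specification, a real pattern would have to be configured per project. The name `parse_line_number` is hypothetical.

```python
import re
from typing import Dict, Optional

# Assumed layout: Fluid-Unit-Sequence-Material-Size-Insulation
LINE_NUMBER = re.compile(
    r'^(?P<fluid>[A-Z]+)-(?P<unit>\d+)-(?P<sequence>\d+)-'
    r'(?P<material>[A-Z0-9]+)-(?P<size>[\d/.]+")-(?P<insulation>[A-Z]+)$'
)


def parse_line_number(text: str) -> Optional[Dict[str, str]]:
    """Return the attribute fields of a line number, or None if it does not match."""
    match = LINE_NUMBER.match(text.strip())
    return match.groupdict() if match else None
```

Text that does not fit the pattern, such as an instrument tag, returns `None` and can be passed on to the other attribute classifiers.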

Next, the attribute information that each symbol can have is defined in advance, and for each recognized symbol and line the attributes matching its attribute types are searched for in the drawing and the nearest one is linked to it (S250). Linking the attributes makes it possible to determine the equipment and quantities needed when preparing an estimate and, when a new P&ID is produced later, to model the symbols and lines and display the text that describes them. Defining the attribute information in advance prevents the error of linking unnecessary or incorrect attributes to a symbol. The extracted symbols and lines are linked to the attributes of the extracted text on the basis of the nearest distance; when an extracted symbol is equipment, it is linked on the basis of the equipment name recognized in the text. For equipment, the name is taken from the Description in the Description area.

Symbol and line attributes can be divided into Process/Utility line, Reducer, Equipment, Nozzle, Instrument, and OPC. These attributes can be defined freely by the user. Of the attributes classified from the text strings, a Process/Utility line uses the Line Number, a Reducer uses Main Size x Sub Size, Equipment and Nozzle use the Tag Number, an Instrument uses the Type and Serial Number, and an OPC uses the P&ID Number.
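The nearest-distance linking, restricted to the attribute types allowed for each kind of object, can be sketched as follows. This is a simplified illustration: every object and text is reduced to a single center point (a real line would need a point-to-segment distance), equipment is left out because it is matched by name rather than by distance, and the `ALLOWED` table and class names are hypothetical.

```python
from dataclasses import dataclass
from math import hypot
from typing import Dict, List, Optional, Set, Tuple

# Assumed mapping from object kind to the text attributes it may be linked to.
ALLOWED: Dict[str, Set[str]] = {
    "Line": {"Line Number"},
    "Reducer": {"Size"},
    "Nozzle": {"Tag Number"},
    "Instrument": {"Instrument Type", "Serial Number"},
    "OPC": {"P&ID Number"},
}


@dataclass
class Text:
    value: str
    attribute: str
    center: Tuple[float, float]


@dataclass
class Item:
    kind: str
    center: Tuple[float, float]


def nearest_attribute(item: Item, texts: List[Text]) -> Optional[Text]:
    """Return the closest text whose attribute type is allowed for this item."""
    allowed = ALLOWED.get(item.kind, set())
    candidates = [t for t in texts if t.attribute in allowed]
    if not candidates:
        return None
    return min(
        candidates,
        key=lambda t: hypot(t.center[0] - item.center[0], t.center[1] - item.center[1]),
    )
```

Filtering by allowed attribute type before measuring distance is what keeps an instrument from being linked to a line number that merely happens to be printed closer to it.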

FIG. 8 shows an embodiment illustrating the step of linking the extracted symbols and lines. 800 is a piece of equipment among the symbols, and "E-234-009" from the Description area is linked to it as the equipment name. 810 shows recognized lines linked to the nearest line numbers, "P-234-03303-CD3D-2"-N" and "P-234-03305-CD3D-3"-N". 820 is the symbol of a Pressure Safety Valve, to which "PSV 2907" is linked as the instrument Type and Serial Number.

With the present invention, a bill of materials (BOM) can be prepared from the design information linked to the attribute information, or the design estimate required in the FEED process can be calculated automatically using the Equipment, Instrument, and other data. In addition, a topology can be generated by integrating objects with one another and an intermediate file in XML format can be produced from it, which can later be used to create P&IDs automatically.

FIG. 9 illustrates how, following the step of linking the design information, a topology is generated by object-to-object integration: a recursive call technique connects each line with the symbols attached to it, and the connected symbols are rearranged in the From-to order of the line. First, the relationships between recognized objects are defined as follows. A Process/Utility line may start at equipment and end at equipment, start at a line and end at equipment, or start at a line and end at a line. In addition, equipment has nozzles that are subordinate to it, and a P&ID drawing is linked to other P&IDs.

A recursive call technique (recursive algorithm) means that a function calls itself. In the present invention, a symbol or line uses the stored connection information to call another symbol or line connected to it, and this process is repeated so that each line and the symbols connected to it are linked and rearranged in From-to order.

To generate the topology, each line is first connected with the symbols attached to it, and the connected symbols are rearranged according to the flow mark of the line. If a connection is found to be broken while connecting a line with its symbols, the coordinates must be adjusted relative to the center line to restore connectivity. To sort in From-to order according to the flow mark, the line connected to the From or To entry in the line list is taken as the starting point, and the lines or objects that connect From to To are found and arranged in order. In FIG. 9, before sorting in the flow mark direction, the symbols were in no particular order (for example, identical symbols No. 2 and No. 3 were extracted consecutively during symbol recognition) and could not form a topology; after sorting in the flow mark direction they are arranged in order from left to right. If the topology were not generated in From-to order, then when creating a new P&ID from the database produced by the method of the present invention, the symbols would have to be modeled from coordinate points alone, in no particular order, and the result could differ from the imaged P&ID drawing. Topology generation is therefore necessary to automate the drawing work that used to be done by hand when creating a new P&ID drawing.
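The recursive From-to ordering can be sketched as a depth-first walk over the stored connection information. The sketch assumes the connections have already been oriented along the flow mark and stored as a mapping from each object to the objects that follow it; the function name `order_from_to` and the object identifiers used with it are hypothetical.

```python
from typing import Dict, List, Optional, Set


def order_from_to(
    start: str,
    connections: Dict[str, List[str]],
    visited: Optional[Set[str]] = None,
) -> List[str]:
    """Recursively follow connections from `start`, returning objects in flow order."""
    if visited is None:
        visited = set()
    if start in visited:
        return []  # already placed; also guards against loops in the drawing
    visited.add(start)
    ordered = [start]
    for following in connections.get(start, []):
        # The function calls itself for every object connected downstream.
        ordered.extend(order_from_to(following, connections, visited))
    return ordered
```

Starting from the object attached to the From end of a line, the walk returns the symbols in flow order regardless of the order in which they were originally recognized.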

Process lines and utility lines are not distinguished when the lines are extracted, but they may be distinguished during topology generation. The drawing may contain original errors or misrecognized parts, so distinguishing the lines by the objects connected to them during object connection is the most accurate approach. If the connected object is an Instrument, the line is classified as a Utility line; otherwise it is classified as a Process line.
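The classification rule can be written as a short function. Treating a line as a Utility line when any one of its connected objects is an Instrument is an assumption of this sketch, since the description does not say how a line with several connected objects is handled; `classify_line` is a hypothetical name.

```python
from typing import Iterable


def classify_line(connected_kinds: Iterable[str]) -> str:
    """Return 'Utility line' if an Instrument is connected, otherwise 'Process line'."""
    for kind in connected_kinds:
        if kind == "Instrument":
            return "Utility line"
    return "Process line"
```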

When equipment is connected, a nozzle may be attached to the equipment or contained within its shape, so nozzles that overlap the equipment shape are found and connected to the equipment. This works because nozzles are generally small shapes located near the equipment.

A step of connecting to other P&IDs using the P&ID Number shown on an OPC may be included. If the P&ID name shown on the OPC differs from the actual file name, a setting that establishes the relationship between the two is required. Also, because several P&IDs may be connected to one P&ID, the information on the connected P&IDs is stored so that P&ID information on other pages can be obtained easily.

Next, the extracted design information and topology may be generated as an intermediate file in a compatible XML format. FIG. 10 is an exemplary diagram of the hierarchy structure of a topology built by linking the objects generated in the intermediate file. For Equipment, the equipment number is stored and the nozzles, stored in the order "p1, p2, v1, v2", are linked to it. In the same way, the line with line number "8"-DMW-UW10029-AR5W-HC" is shown with its symbols linked in the order "FOO41, Valve, Reducer, Foo41".
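One way such a hierarchy could be serialized is sketched below with Python's standard `xml.etree.ElementTree` module. The element and attribute names (`PID`, `Equipment`, `Nozzle`, `Line`, `Symbol`, `tag`, `number`, `order`, `name`) are assumptions made for illustration; the description does not define the schema of the intermediate file.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List


def build_intermediate_xml(
    equipment: Dict[str, List[str]],
    lines: Dict[str, List[str]],
) -> str:
    """Serialize equipment with its nozzles and lines with their ordered symbols."""
    root = ET.Element("PID")
    for tag, nozzles in equipment.items():
        eq = ET.SubElement(root, "Equipment", {"tag": tag})
        for nozzle in nozzles:
            ET.SubElement(eq, "Nozzle", {"name": nozzle})
    for number, symbols in lines.items():
        line = ET.SubElement(root, "Line", {"number": number})
        # The position in the list is the From-to order produced by the topology step.
        for order, symbol in enumerate(symbols, start=1):
            ET.SubElement(line, "Symbol", {"order": str(order), "name": symbol})
    return ET.tostring(root, encoding="unicode")
```

Storing the symbols as ordered children of their line keeps the From-to sequence explicit in the file, so a later tool can rebuild the drawing without relying on coordinates alone.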

Claims

A method of automatically recognizing and classifying design information in an imaged P&ID drawing, the method comprising:
extracting a symbol area from the imaged P&ID drawing, setting the origin and connection points of the corresponding symbol in the symbol area, and automatically registering the symbol in a database;
recognizing and extracting, from the imaged P&ID drawing, the symbol automatically registered in the database, and removing the extracted symbol from the imaged P&ID drawing;
recognizing and extracting lines from the imaged P&ID drawing from which the symbol has been removed, using a sliding window method that operates on blob units rather than pixel units;
calculating aspect ratios in the imaged P&ID drawing from which the symbol has been removed to determine the areas where text exists, and then recognizing and extracting the text from those areas;
classifying, among the extracted text, the text detected in the Drawing area into respective attributes through a predefined attribute classification system; and
associating the extracted symbols and lines with the attributes of the extracted text on the basis of the nearest distance and, when an extracted symbol is equipment, associating it on the basis of the equipment name recognized in the text.

The method according to claim 1,
wherein, in the registering of the symbol, when a symbol is formed as a set of a plurality of additional symbols, it is registered as one symbol including the additional symbols forming the set, and
wherein, in the recognizing and extracting of the symbol, the registered symbol including the additional symbols is inspected and recognized first.

The method according to claim 1,
wherein, in the recognizing and extracting of the symbol, the symbol automatically registered in the database is rotated by 0, 90, 180, and 270 degrees to recognize and extract each symbol from the drawing.

The method according to claim 1,
wherein, in the recognizing and extracting of the symbol, the equipment among the symbols is inspected first, and nozzles are then found and extracted around the equipment area.

The method according to claim 1,
wherein the recognizing and extracting of the symbol comprises comparing the feature points of a recognized symbol with those of the symbol registered in the registering step, and recognizing it as the registered symbol only when the degree of matching is higher than a set threshold.

The method according to claim 1,
wherein, in the recognizing and extracting of the line, when an extracted symbol and a line are connected but their coordinates differ, or two lines are connected but their coordinates differ, the coordinates are corrected in pixel units to extract the line.

The method according to claim 1,
wherein the recognizing and extracting of the text comprises recognizing the text by OCR and, when the text is not correctly recognized, training the OCR by mapping the correct text to it.

The method according to claim 1,
wherein the step of linking the extracted design information further comprises generating a topology by connecting each line with the symbols connected to it through a recursive call technique and rearranging the connected symbols in the From-to order of the line.

The method of claim 8,
further comprising generating the extracted design information and topology as a file in a compatible XML format.

The method of claim 8,
wherein, in the generating of the topology, a line is classified as a utility line if the object connected to it is an instrument among the symbols, and as a process line otherwise.

The method of claim 8,
wherein, in the generating of the topology, a P&ID number that appears on an OPC (Operating page connection) in the extracted text is used to connect to another P&ID.