A New Algorithm for Detecting GPN Protein Expression and Overexpression of IDC and ILC Her2+ Subtypes on Polyacrylamide Gels Associated with Breast Cancer
Figure 1. Image of a polyacrylamide protein gel. The vertical columns, or lanes, represent different experiments placed within the gel (numbered 1 to 15). The horizontal lines, or bands, represent the proteins identified per column.
Figure 2. (A) Purification of recombinant GPN protein from Escherichia coli bacteria. Lane 1: molecular weight control. Lane 2: positive control of GPN protein expression. Lane 3: negative expression control. Lane 4: total protein extract. Lanes 5–9: purified GPN protein. (B) Different concentrations of purified GPN protein. (C) Different concentrations of BSA protein.
Figure 3. Preprocessing and feature extraction to analyze the complete gel image.
Figure 4. (A) SDS-PAGE gel containing GPN protein expressed at different concentrations. Lane 1, molecular weight control; lanes 2–11, GPN at the following concentrations: 2.0, 0.0, 14.5, 9.5, 27, 18, 30, 18.5, 26, and 14 µg/mL, respectively. (B) The intensity profile of lane 1 of (A) (weight control). (C) The intensity profile of lane 4 of (A). (D) The intensity profile of the bands for the recombinant GPN protein region, highlighted in red in the gel in (A).
Figure 5. Preprocessing of SDS-PAGE gel images. (A) Grayscale image. (B) Binarized image. (C) Eroded image.
Figure 6. IPBBIS. (A) Binarized and eroded image. (B) Representation of the binary mask with a size of 1 × 400 pixels placed at pixel 7. (C) A simple histogram of the region contained in the binary mask. (D) Histogram of the region after applying Otsu segmentation to (C).
Figure 7. A new image intensity profile obtained by plotting the values of the array that contains only the white-pixel counts generated by the IPBBIS method.
Figure 8. (A) A complete gel. (B) Plot generated using the image profile based on binarized image segmentation after image equalization. (C) The IPBBIS plot obtained from the non-equalized Figure 6A.
Figure 9. (A) SDS-PAGE gel of BSA protein with the concentrations 2 mg/mL (lane 1), 1 mg/mL (lane 2), and 0.5 mg/mL (lane 3). (B) Binarized, dilated, segmented, and eroded image of (A). (C) Plot of the new intensity profile of the gel in (A) using IPBBIS.
Figure 10. (A) Initial image of the gel with GPN samples at different concentrations before preprocessing. (B) Plot generated using the IPBBIS method applied to (A).
Figure 11. (A) Molecular weight control obtained from lane 1 (the ladder) in Figure 10A. (B) The IPBBIS plot applied to (A). (C) Bands automatically detected by IPBBIS, marked with blue lines.
Figure 12. Results of the interpolation methods for the molecular weight of the detected proteins.
Figure 13. (A) Detected maxima (blue dashed lines). (B) Threshold that yields a cut-off region including only the minima that represent the separation of the samples. (C) Graph obtained from the cut-off region. (D) Total of automatically detected samples (blue dashed lines).
Figure 14. (A) Random sample selection within the gel (red box in the image). (B) Automatic band detection using image profiling based on binarized image segmentation. (C) Molecular weight detection using IPBBIS for GPN protein. (D) Selection of GPN protein bands from different samples for the ROI. (E) Image profile based on binarized image segmentation applied to (D).
Figure 15. Image profiling of Figure 14D, based on binarized image segmentation in the region of interest. (A) lane 2, (B) lane 4, (C) lane 5, (D) lane 6, (E) lane 7, (F) lane 8, (G) lane 9, (H) lane 10, and (I) lane 11.
Figure 16. (A) Manual calculation of the band area by outlining the spot contour. (B) Area calculated by K-means segmentation. (C) Area calculated by Otsu segmentation.
Abstract
1. Introduction
1.1. Novelty
1.2. Limitations and Challenges
2. Materials and Methods
2.1. Creation of Samples with Different GPN Concentrations
2.2. Image Acquisition
2.3. Preprocessing and Feature Extraction
Algorithm 1. Pseudocode to find the number of lanes and bands in the polyacrylamide gel image.
Algorithm for band and lane detection
Resize the image to 600 × 400 px for lightweight processing
if Excess_of_protein:
    Histogram equalization
end if
Obtain a binarized image
Image dilation
Image inversion
Image erosion
Column = 1
if Lane_detection:
    MAXWIDE = 400 px
else:  # band detection
    MAXWIDE = 50 px
end if
Apply binary mask on the resized image
Initialize Array to zero
while Column <= MAXWIDE:
    Get the Histogram_of_image
    Otsu_Segmentation_Applied_to_Binary_Mask_Size_zone
    Get the number of white pixels in the histogram of the segmented region, Histogram[white_position]  # quantity of white in the binarized histogram
    Array[Column] = Number_White_Pixels_Histogram[255]
    Column++
end while
Plot Array
if Lane_Analysis:
    Multiple_Minimum_correlate_Lane_Separation(Array)
    Multiple_Maximums_related_Band_Separation(Array)
else:  # band analysis
    Average_Multiples_Maximums_between_separations_To_Get_One_Maximum
end if
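The column-wise white-pixel count at the core of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses plain nested lists instead of an imaging library, applies one global Otsu threshold rather than re-thresholding each 1-pixel-wide mask region, and omits resizing, equalization, dilation, and erosion. The helper names (`otsu_threshold`, `white_pixel_profile`, `count_runs`) and the synthetic image are hypothetical.

```python
from typing import List, Sequence


def otsu_threshold(pixels: Sequence[int]) -> int:
    """Return the gray level (0-255) that maximizes between-class variance."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0
    for t in range(256):
        w0 += hist[t]            # number of pixels at or below t
        sum0 += t * hist[t]
        w1 = total - w0          # number of pixels above t
        if w0 == 0 or w1 == 0:
            continue
        mean0, mean1 = sum0 / w0, (sum_all - sum0) / w1
        var = w0 * w1 * (mean0 - mean1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def white_pixel_profile(image: List[List[int]]) -> List[int]:
    """Count foreground pixels per column after Otsu binarization.

    Dark bands on a light background are treated as foreground ("white"
    after inversion), mirroring the binarize-then-invert steps.
    """
    threshold = otsu_threshold([p for row in image for p in row])
    width = len(image[0])
    return [sum(1 for row in image if row[col] <= threshold) for col in range(width)]


def count_runs(profile: Sequence[int], min_count: int = 1) -> int:
    """Count contiguous runs of columns whose value is at least min_count."""
    runs, inside = 0, False
    for value in profile:
        if value >= min_count and not inside:
            runs += 1
        inside = value >= min_count
    return runs


# Synthetic 6 x 12 "gel" (hypothetical data): light background, dark bands.
BACKGROUND, BAND = 230, 30
image = [[BACKGROUND] * 12 for _ in range(6)]
for row in (1, 2, 4):          # three bands in the first lane (columns 2-4)
    for col in (2, 3, 4):
        image[row][col] = BAND
for col in (7, 8, 9):          # one band in the second lane (columns 7-9)
    image[2][col] = BAND

profile = white_pixel_profile(image)
lanes = count_runs(profile)
print(profile, lanes)
```

In this toy image the zero-valued columns play the role of the minima that separate lanes, and the height of each run reflects how many band pixels the lane contains. A real gel would need the smoothing and thresholding of minima and maxima that Algorithm 1 describes.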
3. Results and Discussion
3.1. Traditional Analysis of GPN Protein Gels at Different Concentrations
3.2. Preprocessing and Feature Extraction Using the Proposed Algorithm
3.3. Detection of Protein Overexpression in Gels Using the IPBBIS Algorithm
3.3.1. Application of the New Intensity Profile on the Complete Gel Image
3.3.2. Application of the IPBBIS Method on a Sample with Controlled Concentrations
3.3.3. Effectiveness of the IPBBIS Method Using Known Concentrations
3.3.4. Elimination of Impurities through the Determination of the Molecular Weight of the Target Protein
3.3.5. Choice of Threshold for the Elimination of Multiple Maximums
3.3.6. Analysis of the Region of Interest Using the IPBBIS Methods, Manual Area Calculation, Area Calculation by K-Means Segmentation, and Area Calculation by Otsu Segmentation
3.3.7. IPBBIS Study on the Image Dataset Using the Confusion Matrix
3.3.8. Functionality of the Methods Analyzed to Find GPN Protein Overexpression: IPBBIS, Manual Area Calculation, Area Calculation by K-Means Segmentation, and Area Calculation by Otsu Segmentation
4. Conclusions
- Future Work
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Lane | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 |
Value | 79 | 71 | 58 | 40 | 23 | 29 | 45 | 61 | 41 | 45 | 41 |
Lane | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 |
Concentration mg/mL | 2.0 | 0.0 | 14.5 | 9.5 | 27 | 18 | 30 | 18.5 | 26 | 14 |
Interpolation Method | Calculated Weight (kDa) | Total Error % |
---|---|---|
Linear | 33.4 | 3.35648148 |
Nearest | 37.0 | 7.060185185 |
Cubic | 31.38 | 9.194960019 |
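The error column is consistent with a relative error computed against a reference weight of 34.56 kDa. That reference is not stated in the table; it is back-calculated here from the Linear and Nearest rows and should be read as an assumption. A minimal Python sketch of the arithmetic:

```python
# Reference molecular weight in kDa. ASSUMPTION: inferred from the table
# (it reproduces the Linear and Nearest rows), not quoted from the source.
REFERENCE_KDA = 34.56

# Calculated weights per interpolation method, copied from the table.
calculated_kda = {"Linear": 33.4, "Nearest": 37.0, "Cubic": 31.38}


def percent_error(estimate: float, reference: float) -> float:
    """Absolute relative error of an estimate, expressed in percent."""
    return abs(estimate - reference) / reference * 100.0


errors = {name: percent_error(w, REFERENCE_KDA) for name, w in calculated_kda.items()}
best_method = min(errors, key=errors.get)

for name, err in errors.items():
    print(f"{name}: {err:.4f} %")
print("Smallest error:", best_method)
```

Under this assumption the Linear and Nearest rows are reproduced to the printed precision. The Cubic row differs slightly in the third significant digit, presumably because the tabulated weight of 31.38 kDa is rounded.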
Lane | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 |
Concentration mg/mL | 2.0 | 0.0 | 14.5 | 9.5 | 27 | 18 | 30 | 18.5 | 26 | 14 |
ROI-GPN | 10.38 | 0.0 | 20.55 | 18.67 | 27.34 | 22.87 | 28.82 | 23.08 | 27.26 | 20.11 |
Area Manual | 513.62 | 0.0 | 830.25 | 680.12 | 1373.80 | 830.00 | 1061.50 | 934.50 | 983.50 | 869.25 |
Area K-means segmentation | 495.50 | 0.0 | 969.62 | 933.00 | 1457.20 | 1132.20 | 1336.20 | 1134.90 | 1422.10 | 993.13 |
Area Otsu segmentation | 535.75 | 0.0 | 1189.8 | 1023.4 | 1646.8 | 1318.2 | 1557.1 | 1269.1 | 1585.9 | 1174.6 |
Real \ Predicted | Positive | Negative |
---|---|---|
Positive | TP = 310 | FN = 8 |
Negative | FP = 2 | TN = 349 |
Accuracy of Homogeneous Gels | Accuracy of Heterogeneous Gels |
---|---|
0.985052 | 0.91736 |
Real \ Predicted | Positive | Negative |
---|---|---|
Positive | TP = 671 | FN = 105 |
Negative | FP = 24 | TN = 761 |
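The two accuracy values can be recomputed from the confusion matrices as (TP + TN) divided by the total number of cases. The short Python sketch below does this; the first matrix reproduces the accuracy reported for homogeneous gels and the second the accuracy for heterogeneous gels. The class name `ConfusionMatrix` is hypothetical, and the sensitivity and specificity methods are additional standard metrics that the tables do not report.

```python
from typing import NamedTuple


class ConfusionMatrix(NamedTuple):
    tp: int  # true positives
    fn: int  # false negatives
    fp: int  # false positives
    tn: int  # true negatives

    def accuracy(self) -> float:
        """Fraction of all cases that were classified correctly."""
        return (self.tp + self.tn) / (self.tp + self.fn + self.fp + self.tn)

    def sensitivity(self) -> float:
        """Fraction of real positives that were detected."""
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        """Fraction of real negatives that were rejected."""
        return self.tn / (self.tn + self.fp)


# Counts copied from the two confusion matrices in the tables.
first = ConfusionMatrix(tp=310, fn=8, fp=2, tn=349)
second = ConfusionMatrix(tp=671, fn=105, fp=24, tn=761)

for label, cm in (("first matrix", first), ("second matrix", second)):
    print(f"{label}: accuracy={cm.accuracy():.6f}, "
          f"sensitivity={cm.sensitivity():.4f}, specificity={cm.specificity():.4f}")
```

The lower accuracy of the second matrix comes mostly from its false negatives (105 of 776 real positives), which the sensitivity value makes visible.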
Concentration mg/mL | 0 | 2 | 9.5 | 14 | 14.5 | 18 | 18.5 | 26 | 27 | 30 |
Lane | 3 | 2 | 5 | 11 | 4 | 7 | 9 | 10 | 6 | 8 |
ROI-GPN | 0 | 10.38 | 18.67 | 20.11 | 20.55 | 22.87 | 23.08 | 27.26 | 27.34 | 28.82 |
Manual Area | 0 | 513.6 | 680.1 | 869.25 | 830.25 | 830 | 934.5 | 983.5 | 1373.8 | 1061.5 |
K-means segmentation Area | 0 | 495.5 | 933 | 993.13 | 969.62 | 1132 | 1134.9 | 1422.1 | 1457.2 | 1336.2 |
Otsu segmentation Area | 0 | 535.8 | 1023 | 1174.6 | 1189.8 | 1318 | 1269.1 | 1585.9 | 1646.8 | 1557.1 |
Normalized Data
Concentration mg/mL | 0.00 | 0.07 | 0.32 | 0.47 | 0.48 | 0.60 | 0.62 | 0.87 | 0.90 | 1.00 |
Lane | 3 | 2 | 5 | 11 | 4 | 7 | 9 | 10 | 6 | 8 |
ROI-GPN | 0.00 | 0.36 | 0.65 | 0.70 | 0.71 | 0.79 | 0.80 | 0.95 | 0.95 | 1.00 |
Manual Area | 0.00 | 0.37 | 0.50 | 0.63 | 0.60 | 0.60 | 0.68 | 0.72 | 1.00 | 0.77 |
K-means segmentation Area | 0.00 | 0.34 | 0.64 | 0.68 | 0.67 | 0.78 | 0.78 | 0.98 | 1.00 | 0.92 |
Otsu segmentation Area | 0.00 | 0.33 | 0.62 | 0.71 | 0.72 | 0.80 | 0.77 | 0.96 | 1.00 | 0.95 |
Concentration mg/mL | 0.00 | 0.07 | 0.32 | 0.47 | 0.48 | 0.60 | 0.62 | 0.87 | 0.90 | 1.00 |
Lane | 3 | 2 | 5 | 11 | 4 | 7 | 9 | 10 | 6 | 8 |
ROI-GPN | 0.00 | 0.36 | 0.29 | 0.05 | 0.02 | 0.08 | 0.01 | 0.15 | 0.00 | 0.05 |
Comparison of Manual Area | 0.00 | 0.37 | 0.12 | 0.14 | −0.03 | 0.00 | 0.08 | 0.04 | 0.28 | −0.23 |
Comparison of K-means Segmentation Area | 0.00 | 0.34 | 0.30 | 0.04 | −0.02 | 0.11 | 0.00 | 0.20 | 0.02 | −0.08 |
Comparison of Otsu Segmentation Area | 0.00 | 0.33 | 0.30 | 0.09 | 0.01 | 0.08 | −0.03 | 0.19 | 0.04 | −0.05 |
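The normalized rows match each measurement divided by its own maximum, and the comparison rows appear to be the differences between consecutive normalized values once the lanes are ordered by concentration. Both readings are inferred from the numbers rather than stated in the tables. A minimal Python sketch under that assumption, with hypothetical helper names:

```python
from typing import Dict, List

# Rows ordered by increasing concentration, copied from the table above.
measurements: Dict[str, List[float]] = {
    "ROI-GPN": [0, 10.38, 18.67, 20.11, 20.55, 22.87, 23.08, 27.26, 27.34, 28.82],
    "Manual Area": [0, 513.6, 680.1, 869.25, 830.25, 830, 934.5, 983.5, 1373.8, 1061.5],
    "K-means segmentation Area": [0, 495.5, 933, 993.13, 969.62, 1132, 1134.9, 1422.1, 1457.2, 1336.2],
    "Otsu segmentation Area": [0, 535.8, 1023, 1174.6, 1189.8, 1318, 1269.1, 1585.9, 1646.8, 1557.1],
}


def normalize_by_max(values: List[float]) -> List[float]:
    """Scale a row so that its largest value becomes 1.0."""
    peak = max(values)
    return [v / peak for v in values]


def consecutive_differences(values: List[float]) -> List[float]:
    """Keep the first entry, then return value[i] - value[i - 1] for the rest."""
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


normalized = {name: normalize_by_max(v) for name, v in measurements.items()}
differences = {name: consecutive_differences(v) for name, v in normalized.items()}

# A negative difference means the measurement dropped while concentration rose.
decreases = {name: sum(1 for d in diffs if d < 0) for name, diffs in differences.items()}

for name, count in decreases.items():
    print(f"{name}: {count} decrease(s)")
```

With these inputs ROI-GPN is the only row that never decreases as concentration increases. The Manual Area row contains one very small drop (830.25 to 830) that the two-decimal table displays as 0.00.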
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Juarez-Lucero, J.; Guevara-Villa, M.; Sanchez-Sanchez, A.; Diaz-Hernandez, R.; Altamirano-Robles, L. A New Algorithm for Detecting GPN Protein Expression and Overexpression of IDC and ILC Her2+ Subtypes on Polyacrylamide Gels Associated with Breast Cancer. Algorithms 2024, 17, 149. https://doi.org/10.3390/a17040149