WO2022217294A1 - Personalized biometric anti-spoofing protection using machine learning and enrollment data - Google Patents
- Publication number
- WO2022217294A1 (PCT/US2022/071653; US2022071653W)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- biometric data
- data source
- enrollment
- received image
- features
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/40—Spoof detection, e.g. liveness detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/80—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
- G06V10/806—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/12—Fingerprints or palmprints
- G06V40/1365—Matching; Classification
Definitions
- aspects of the present disclosure relate to machine learning and, more particularly, to using artificial neural networks to protect against biometric credential spoofing in biometric authentication systems.
- Biometric data generally includes information derived from the physical characteristics of a user associated with the biometric data, such as fingerprint data, iris scan data, facial images (e.g., with or without three-dimensional depth data) and the like.
- a user typically enrolls with an authentication service (e.g., executing locally on the device or remotely on a separate computing device) by providing one or more scans of a relevant body part to the authentication service that can be used as a reference data source.
- multiple fingerprint scans may be provided to account for differences in the way a user holds a device, to account for differences between different regions of the finger, and to account for different fingers that may be used in authenticating the user.
- multiple facial images captured from multiple angles can be provided to account for differences in the way a user looks at a device.
- the user may scan the relevant body part, and the captured image (or representation thereof) may be compared against a reference (e.g., a reference image or representation thereof). If the captured image is a sufficient match to the reference image, access to the device or application may be granted to the user. Otherwise, access to the device or application may be denied, as an insufficient match may indicate that an unauthorized or unknown user is trying to access the device or application.
- biometric authentication systems add additional layers of security to access controlled systems versus passwords or passcodes
- fingerprints can be authenticated based on similarities between ridges and valleys captured in a query image and captured in one or more enrollment images (e.g., through ultrasonic sensors, optical sensors, or the like). Because the general techniques by which these biometric authentication systems authenticate users is known, it may be possible to attack these authentication systems and gain unauthorized access to protected resources using a reproduction of a user’s fingerprint. These types of attacks may be referred to as fingerprint “spoofing.” In another example, because facial images are widely available (e.g., on the Internet), these images can also be used to attack facial recognition systems.
- Certain aspects provide a method for biometric authentication.
- the method generally includes receiving an image of a biometric data source for a user; extracting, through a first artificial neural network, features for at least the received image; combining the extracted features for the at least the received image and a combined feature representation of a plurality of enrollment biometric data source images; determining, using the combined extracted features for the at least the received image and the combined feature representation of the plurality of enrollment biometric data source images as input into a second artificial neural network, whether the received image of the biometric data source for the user is from a real biometric data source or a copy of the real biometric data source; and taking one or more actions to allow or deny the user access to a protected resource based on the determination.
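The following Python sketch (illustrative only; the module and variable names are assumptions, not elements recited in the claims) shows one way the claimed flow could be wired together: a first network extracts features from the query image, those features are combined with an aggregated enrollment representation, and a second network makes the live-or-spoof determination that drives the access decision.

```python
# Illustrative sketch of the claimed flow, not the patent's implementation.
import torch
import torch.nn as nn

class AntiSpoofPipeline(nn.Module):
    def __init__(self, feature_extractor: nn.Module, classifier: nn.Module):
        super().__init__()
        self.feature_extractor = feature_extractor  # "first artificial neural network"
        self.classifier = classifier                # "second artificial neural network"

    def forward(self, query_image: torch.Tensor, enrollment_feature: torch.Tensor) -> torch.Tensor:
        query_feature = self.feature_extractor(query_image)                # extract query features
        combined = torch.cat([query_feature, enrollment_feature], dim=-1)  # simple infusion by concatenation
        return self.classifier(combined)                                   # live-vs-spoof score

def allow_access(live_score: torch.Tensor, threshold: float = 0.5) -> bool:
    # Take an action based on the determination: allow access if predicted live.
    return bool(live_score.item() >= threshold)
```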
- processing systems configured to perform the aforementioned methods as well as those described herein; non-transitory, computer- readable media comprising instructions that, when executed by one or more processors of a processing system, cause the processing system to perform the aforementioned methods as well as those described herein; a computer program product embodied on a computer-readable storage medium comprising code for performing the aforementioned methods as well as those further described herein; and a processing system comprising means for performing the aforementioned methods, as well as those further described herein.
- FIG. 1 depicts an example fingerprint authentication pipeline.
- FIG. 2 illustrates example anti-spoofing protection systems in a fingerprint authentication pipeline.
- FIG. 3 illustrates example operations for fingerprint authentication, according to aspects of the present disclosure.
- FIG. 4 illustrates a fingerprint anti-spoofing protection pipeline in which a query image and enrollment data are used to determine whether the query image is from a real finger, according to aspects of the present disclosure.
- FIG. 5 illustrates example feature extraction pipelines for extracting fingerprint features from query images and enrollment images, according to aspects of the present disclosure.
- FIG. 6 illustrates example feature aggregation pipelines for aggregating fingerprint features extracted from representations of enrollment images into a consolidated feature set, according to aspects of the present disclosure.
- FIG. 7 illustrates example architectures of neural networks that can be used to aggregate features extracted from a plurality of enrollment images, according to aspects of the present disclosure.
- FIGs. 8A through 8C illustrate example feature infusion pipelines for combining features extracted from the query images and enrollment images for use in determining whether a query image is from a real finger, according to aspects of the present disclosure.
- FIG. 8 illustrates example alignment preprocessing that may be performed on a query image or one or more enrollment images prior to determining whether the query image is from a real finger, according to aspects of the present disclosure.
- FIG. 9 illustrates an example implementation of a processing system in which fingerprint authentication and anti-spoofing protection within a fingerprint authentication pipeline can be performed, according to aspects of the present disclosure.
- aspects of the present disclosure provide techniques for anti-spoofing protection within a biometric authentication pipeline.
- images are generally captured of a biometric characteristic of a user (e.g., a fingerprint image obtained from an image scan or an ultrasonic sensor configured to generate an image based on reflections from ridges and valleys in a fingerprint, face structure derived from a facial scan, iris structure derived from an iris scan, etc.) for use in authenticating the user.
- the false acceptance rate (FAR) may represent a rate at which a biometric security system incorrectly allows access to a system or application (e.g., to a user other than the user(s) associated with reference image(s) in the biometric security system), and the false rejection rate (FRR) may represent a rate at which a biometric security system incorrectly blocks access to a system or application.
- a false acceptance may constitute a security breach, while a false rejection may be an annoyance.
- biometric security systems are frequently used to allow or disallow access to potentially sensitive information or systems, and because false acceptances are generally dangerous, biometric security systems may typically be configured to minimize the FAR to as close to zero as possible, usually with the tradeoff of an increased FRR.
- biometric security systems may be fooled into falsely accepting spoofed biometric credentials, which may allow for unauthorized access to protected resources and other security breaches within a computing system.
- a fake finger created with a fingerprint lifted from another location can be used to gain unauthorized access to a protected computing resource.
- These fake fingers may be easily created, for example, using three-dimensional printing or other additive manufacturing processes, gelatin molding, or other processes.
- images or models of a user’s face can be used to gain unauthorized access to a protected computing resource protected by a facial recognition system.
- biometric authentication systems generally include anti-spoofing protection systems that attempt to distinguish between biometric data from real or fake sources.
- FIG. 1 illustrates an example biometric authentication pipeline 100, in accordance with certain aspects of the present disclosure. While biometric authentication pipeline 100 is illustrated as a fingerprint authentication pipeline, it should be recognized that biometric authentication pipeline 100 may be also or alternatively used in capturing and authenticating other biometric data, such as facial scans, iris scans, and other types of biometric data.
- a sensor 110 captures biometric data, such as an image of a fingerprint, and a comparator 120 determines whether the biometric data captured by sensor 110 corresponds to one of a plurality of known sets of biometric data (e.g., whether a captured image of a fingerprint corresponds to a known fingerprint).
- comparator 120 can compare the captured biometric data (or features derived from) to samples in an enrollment sample set (or features derived therefrom) captured when a user enrolls one or more biometric data sources (e.g., fingers) for use in authenticating the user.
- the enrollment image set includes a plurality of images for each biometric data source enrolled in a fingerprint authentication system.
- the actual enrollment images may be stored in a secured region in memory, or a representation of the enrollment images may be stored in lieu of the actual enrollment images to protect against extraction and malicious use of the enrollment images.
- comparator 120 can identify unique physical features within captured biometric data and attempt to match these unique physical features to similar physical features in one of the enrollment samples (e.g., an enrollment image). For example, in a fingerprint authentication system, comparator 120 can identify patterns of ridges and valleys in a fingerprint and/or fingerprint minutiae such as ridge/valley bifurcations or terminations to attempt to match the captured fingerprint to an enrollment image. In another example, in a facial recognition system, comparator 120 can identify various points on a face and identify visual patterns located at these points (e.g., “crows feet” around the eye area, dimples, wrinkles, etc.) in an attempt to match a captured image of a user’ s face to an enrollment image.
- comparator 120 may apply various transformations to the captured biometric data to attempt to align features in the captured biometric data with similar features in one or more of the images in the enrollment image set. These transformations may include, for example, applying rotational transformations to (i.e., rotating) the captured biometric data, laterally shifting (i.e., translating) the captured biometric data, scaling the captured biometric data to a defined resolution, combining the captured biometric data with one or more of the enrollment images in the enrollment image set to create a composite image, or the like. If comparator 120 determines that the captured biometric data does not match any of the images in the enrollment image set, comparator 120 can determine that the captured biometric data is not from an enrolled user and can deny access to protected computing resources.
- an anti-spoofing protection engine 130 can determine whether the captured biometric data is from a real source or a fake source. If the captured biometric data is from a real source, anti-spoofing protection engine 130 can allow access to the protected computing resources; otherwise, anti-spoofing protection engine 130 can deny access to the protected computing resources.
- Various techniques may be used to determine whether the captured biometric data is from a real source or a fake source. For example, in a fingerprint authentication system, surface conductivity can be used to determine whether the fingerprint image is from a real finger or a fake finger.
- depth maps, temperature readings, and other information can be used to determine whether the source is real or fake, based on an assumption that a real source will have a significant amount of three-dimensional data (as opposed to a printed image which will not have a significant amount of three-dimensional data) and may emit a temperature at or near an assumed normal body temperature (e.g., 98.6° F or 37° C).
- While FIG. 1 illustrates a biometric authentication pipeline in which a comparison is performed prior to determining whether the captured biometric data (e.g., a captured image of a fingerprint) is from a real source or a fake source, in other aspects, anti-spoofing protection engine 130 can determine whether captured biometric data is from a real source or a fake source prior to comparator 120 determining whether a match exists between the biometric data captured by sensor 110 and one or more images in an enrollment image set.
- FIG. 2 illustrates example anti-spoofing protection systems in a fingerprint authentication pipeline.
- a sample 202 captured by a fingerprint sensor (e.g., an ultrasonic sensor, an optical sensor, etc.) may be input into an anti-spoofing protection (ASP) model 204.
- This anti-spoofing protection model may be trained generically based on a predefined training data set to determine whether the captured sample 202 is from a real finger or a fake finger (e.g., to make a live or spoof decision which may be used in a fingerprint authentication pipeline to determine whether to grant a user access to protected computing resources).
- Anti-spoofing protection model 204 may be inaccurate, as the training data set used to train the anti-spoofing protection model 204 may not account for natural variation between users that may change the characteristics of a sample 202 captured for different users. For example, users may have varying skin characteristics that may affect the data captured in sample 202, such as dry skin, oily skin, or the like. Users with dry skin may, for example, cause generation of a sample 202 with less visual acuity than users with oily skin. Additionally, anti-spoofing protection model 204 may not account for differences between the sensors and/or surface coverings for a sensor used to capture sample 202.
- sensors may have different levels of acuity or may be disposed underneath cover glass of differing thicknesses, refractivity, or the like. Further, different instances of the same model of sensor may have different characteristics due to manufacturing variability (e.g., in alignment, sensor thickness, glass cover thickness, etc.) and calibration differences resulting therefrom. Still further, some users may cover the sensor used to capture sample 202 with a protective film that can impact the image captured by the sensor. Even still, different sensors may have different spatial resolutions.
- aspects of the present disclosure allow for the integration of subject and sensor information into an anti-spoofing protection model 216.
- a sample 212 captured by a fingerprint sensor and information 214 about the subject and/or the sensor may be input into an anti-spoofing protection model 216 trained to predict whether a fingerprint captured in sample 212 is from a real finger or a fake finger.
- the information about the subject and/or the sensor may, as discussed in further detail below, be derived from an enrollment image set or from information derived from images in an enrollment image set.
- anti-spoofing protection model 216 can be trained to identify whether a sample 212 is from a real finger or a fake finger based on user and device characteristics that may not be captured in a generic training data set.
- the accuracy of fingerprint authentication systems in identifying spoofing attacks may be increased, which may increase the security of computing resources protected by fingerprint authentication systems.
- Anti-spoofing protection models may also be used in other biometric authentication systems, such as authentication systems that use iris scanning, facial recognition, or other biometric data. As with the anti-spoofing protection model for a fingerprint authentication pipeline discussed above, anti-spoofing protection models may be inaccurate, because the training data set used to train these models may not account for natural variation between users that may change the characteristics of a sample captured for different users. For example, users may have varying levels of contrast in iris color that may cause the generation of samples with differing levels of visual acuity, may wear glasses or other optics that affect the details captured in a sample, or the like. Further, the anti-spoofing protection models may not account for differences in the cameras, such as resolution, optical formulas, or the like, that can be used to capture samples used in iris or facial recognition systems.
- anti-spoofing protection systems may be trained using a training data set generated from a large-scale anti-spoofing data set (e.g., in scenarios in which access to sensors and users for data collection is unavailable).
- the personalized data set may include data for a number of different users, with each user having a constant number of enrollment images.
- the first N live samples may be selected as an enrollment data set for each user in the anti-spoofing data set, and the remaining live samples and a number of spoof samples randomly obtained from other data sources (e.g., image repositories, data sources on the internet, etc.) may be selected as a set of query samples for training the anti-spoofing protection systems.
- the N images used as enrollment data may be equidistantly sampled from a selected video clip having illumination changes below a threshold value (e.g., such that the biometric data source is captured in the video with minimal changes in lighting and thus in the quality of the data captured in the video) and with variation in subject pose.
- Other videos for the user having a same spatial resolution as the selected video clip may be treated as associated query data against which the anti-spoofing protection system may be trained.
- FIG. 3 illustrates example operations 300 that may be performed for biometric authentication, according to certain aspects of the present disclosure.
- a computing system receives an image of a biometric data source for a user.
- the received image may be an image generated by one of a variety of sensors, such as ultrasonic sensors, optical sensors, or other devices that can capture unique features of a biometric data source, such as a finger, an iris, a user’s face, or the like, for use in authenticating a user of the computing system.
- the received image may be an image in a binary color space. For example, in a binary color space in which images of a fingerprint are captured, a first color represents ridges of a captured fingerprint and a second color represents valleys of the captured fingerprint.
- the received image may be an image in a low-bit-depth monochrome color space in which a first color represents ridges of a captured fingerprint, a second color represents valleys of the captured fingerprint, and colors between the first color and second color represent transitions between valleys and ridges of the captured fingerprint.
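As a rough illustration of the color spaces described above, the following sketch quantizes a normalized grayscale fingerprint image into a binary representation (one level for ridges, one for valleys) or a low-bit-depth monochrome representation (intermediate levels for ridge-to-valley transitions); the threshold and bit depth are assumptions, not values from the disclosure.

```python
import numpy as np

def to_binary(image: np.ndarray, threshold: float = 0.5) -> np.ndarray:
    # Binary color space: one color for ridges, one for valleys.
    return (image >= threshold).astype(np.uint8)

def to_low_bit_depth(image: np.ndarray, bits: int = 4) -> np.ndarray:
    # Low-bit-depth monochrome: intermediate levels for ridge/valley transitions.
    levels = 2 ** bits - 1
    return np.clip(np.round(image * levels), 0, levels).astype(np.uint8)
```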
- the computing system extracts, through a first artificial neural network, features for at least the received image.
- the first artificial neural network may include, for example, convolutional neural networks (CNNs), transformer neural networks, recurrent neural networks (RNNs), or any of various other suitable artificial neural networks that can be used to extract features from an image or a representation thereof.
- features may be extracted for the received image and for images in an enrollment image set using neural networks using different weights or using the same weights.
- features may be extracted for the images in the enrollment image set a priori (e.g., when a user enrolls a finger for use in fingerprint authentication, enrolls an iris for use in iris authentication, enrolls a face for use in facial recognition-based authentication, etc.).
- features may be extracted for the images in the enrollment image set based on a non-image representation of the received image (also referred to as a query image) when a user attempts to authenticate through a biometric authentication system.
- the computing system combines the extracted features for the at least the received image and a combined feature representation of a plurality of enrollment biometric data source images.
- the combined feature representation of the plurality of enrollment biometric data source images may be generated, for example, by aggregating features extracted from individual images of the plurality of enrollment biometric data source images into the combined feature representation.
- the features extracted for the received image and the combined feature representation of the plurality of enrollment biometric data source images may be combined using various feature infusion techniques that can generate a combined set of features, which then may be used to determine whether the received image of the biometric data source for the user is from a real biometric data source or a fake biometric data source that is a copy of the real biometric data source (e.g., a real fingerprint or a fake that is a copy of the real fingerprint).
- the computing system determines, using the combined extracted features for the at least the received image and the combined feature representation of the plurality of enrollment biometric data source images as input into a second artificial neural network, whether the received image of the biometric data source for the user is from a real biometric data source or a copy of the real biometric data source.
- a copy of the real biometric data source may include a replica of the real biometric data source (e.g., a replica of a real fingerprint implemented on a fake finger), a synthesized input generated from minutiae captured from other sources, a synthetically generated and refined image of a biometric data source, or an image of a biometric data source (e.g., from a collection of images) designed to match many users of a fingerprint authentication system.
- a copy of the real biometric data source may also or alternatively include data from non-biometric sources.
- the system can determine whether the received image of the biometric data source for the user is from a real biometric data source or a copy of the real biometric data source using a multilayer perceptron (MLP) neural network or other neural networks that can use the features extracted from the received image and the combined feature representation of the plurality of enrollment biometric data source images to determine whether the received image is from a real biometric data source or a copy of the real biometric data source.
- the computing system takes one or more actions to allow or deny the user access to a protected resource based on the determination.
- In some aspects, where the determination is performed after determining that the received image of the biometric data source matches one or more of the enrollment images, the computing system can allow the user access to the protected computing resource if the determination is that the image of the biometric data source is from a real biometric data source and can deny the user access to the protected computing resource if the determination is that the image of the biometric data source is from a copy of the real biometric data source.
- In other aspects, where the determination is performed before biometric matching, the computing system can proceed to perform biometric matching against the enrollment images if the determination is that the image of the biometric data source is from a real biometric data source and can deny the user access to the protected computing resource, without performing biometric matching against the enrollment images, if the determination is that the image of the biometric data source is from a copy of the real biometric data source.
- FIG. 4 illustrates an anti-spoofing protection pipeline 400 that uses query and enrollment data to determine whether a query image is from a real biometric source, according to aspects of the present disclosure.
- anti-spoofing protection pipeline 400 may be used in a fingerprint authentication system to determine whether a query image is from a real fingerprint. It should be recognized, however, that anti-spoofing protection pipeline 400 may be applied to enrollment and query images for data obtained from any variety of biometric data sources, such as images of an iris, images of a user’s face, graphical representations of a user’s voice, or other image-based authentication in which images of a biometric data source for a user are used to authenticate the user.
- the fingerprint anti-spoofing protection pipeline 400 may include a feature extraction stage 410, a feature aggregation stage 420, and a feature infusion stage 430.
- anti-spoofing protection pipeline 400 may begin with feature extraction stage 410.
- In feature extraction stage 410, convolutional neural networks may be used to extract features from the received query image of the user fingerprint and one or more previously generated enrollment images.
- the enrollment images may be images that a user provided to a fingerprint authentication system when enrolling a finger for use in fingerprint authentication, and these images may be used to determine whether the received query image corresponds to an image of an enrolled fingerprint and to determine whether to grant access to computing resources protected by a fingerprint authentication system.
- feature extraction stage 410 may extract features from the received query image of the user fingerprint and may extract features associated with each of the plurality of enrollment images based on a representation of each of the plurality of enrollment images rather than the enrollment images themselves.
- Features may generally be extracted from the received query image and the one or more previously generated enrollment images using convolutional neural networks.
- these features may be features that are learned by the convolutional neural networks as features that may be useful for a specific classification task (e.g., the fingerprint spoofing discussed herein).
- the features extracted by a last layer of a convolutional neural network may represent concrete qualities of an input image or portions thereof, such as brightness, statistics related to blobs, dots, bifurcations, or the like in an image.
- the features extracted by the convolutional neural networks may also or alternatively include abstract, high-level combinations of features and shapes identified in the received query image and the enrollment images.
- the convolutional neural networks may share parameters, such as weights and biases, or may use different parameters.
- Various techniques may be used to extract features from fingerprint images or data derived from these fingerprint images, as discussed in further detail below with respect to FIG. 5.
- a query image and N enrollment images may be received at feature extraction stage 410.
- the N enrollment images may be processed through a feature extractor φ_e(·) to generate a set of features E = {e_1, e_2, ..., e_N}, where e_i ∈ R^D, and D represents a number of values that describe the features extracted from each image (also referred to as a dimensionality of the features extracted from the image).
- Feature aggregation stage 420 generally creates a combined feature representation of the plurality of enrollment fingerprint images from the features extracted at feature extraction stage 410 for the plurality of enrollment fingerprint images.
- the combined feature representation may be generated, for example, by concatenating features extracted from the plurality of enrollment fingerprint images into a single set of features.
- Various techniques may be used to generate the combined feature representation, as discussed in further detail below with respect to FIG. 6.
- the feature aggregation stage 420 can combine the enrollment features into a single feature using various techniques, as discussed in further detail below with respect to FIGs. 5 through 7.
- the aggregation of the enrollment features e_1, ..., e_N into a single aggregated feature e_agg may be performed based on vector concatenation, calculation of an arithmetic mean, or other techniques that can be used to aggregate features into a single aggregated feature.
- in vector concatenation, enrollment features may be concatenated along a given axis to obtain a one-dimensional vector having dimensions of N * D.
- when an arithmetic mean is used, an aggregated feature vector may be calculated according to the equation e_agg = (1/N) * Σ_{i=1}^{N} e_i.
- in this way, the enrollment features e_i extracted from images with i ∈ {1, 2, ..., N} may be compacted into D values.
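A minimal sketch of these two non-parametric aggregation options, assuming the enrollment features are stacked row-wise in an N x D array:

```python
import numpy as np

def aggregate_concat(enroll_feats: np.ndarray) -> np.ndarray:
    # Vector concatenation: yields a one-dimensional vector of length N * D.
    return enroll_feats.reshape(-1)

def aggregate_mean(enroll_feats: np.ndarray) -> np.ndarray:
    # Arithmetic mean: compacts the N enrollment features into D values.
    return enroll_feats.mean(axis=0)
```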
- Feature infusion stage 430 generally combines the extracted features for the received image generated in feature extraction stage 410 and the combined feature representation of the plurality of enrollment images generated in feature aggregation stage 420 into data that can be used by MLP 440 to determine whether the received query image is from a real fingerprint or a copy of the real fingerprint.
- Feature infusion stage 430 may use one or more artificial neural networks to combine the extracted features for the received image and the combined feature representation of the plurality of enrollment fingerprint images into a combined set of visual features. Techniques used to combine the extracted features for the received image and the combined feature representation of the plurality of enrollment fingerprint images are discussed in further detail below with respect to FIGs. 8A through 8C.
- FIG. 5 illustrates various techniques that may be implemented in feature extraction stage 410 for extracting features from the received fingerprint images and the enrollment fingerprint images. Again, while FIG. 5 illustrates these techniques in the context of fingerprint images, it should be recognized that the feature extraction techniques discussed herein may be applied to enrollment and query images for data obtained from any variety of biometric data sources.
- Example 500A illustrates feature extraction using weight-shared convolutional neural networks.
- In example 500A, two CNNs 502 using the same parameters (e.g., weights, biases, etc.) may be used to extract features from the received query image and the plurality of enrollment images.
- a combined feature representation 510 may be generated from the output of the CNNs 502.
- an artificial neural network such as MLP 520, can use the combined feature representation 510 to determine whether the received query fingerprint image is from a real fingerprint or a copy of the real fingerprint.
- the output of the artificial neural network may be used to take one or more actions to allow or block access to a protected computing resource.
- the features extracted from the received query image and the enrollment images may have the same or different dimensionality and may be obtained from the same neural network or a different neural network, and the visual features may be spatial features or non-spatial features.
- CNN 502 may, in some aspects, be implemented with multiple layers, with a last layer in the CNN 502 being a global spatial pooling operator.
- CNN 502 may be trained, in some aspects, as part of an end-to-end anti-spoofing protection model.
- CNN 502 may be pre-trained on query images as part of an anti-spoofing protection model. Weights may subsequently be modified to extract features from the enrollment images captured locally on a computing device.
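The weight-shared arrangement of example 500A can be sketched as applying one small CNN, ending in a global spatial pooling operator, to both the query image and each enrollment image; the layer sizes and input dimensions below are illustrative assumptions.

```python
import torch
import torch.nn as nn

class SharedExtractor(nn.Module):
    def __init__(self, feat_dim: int = 64):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(16, feat_dim, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # global spatial pooling as the last stage
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.backbone(x).flatten(1)  # (batch, feat_dim)

extractor = SharedExtractor()
query_feat = extractor(torch.randn(1, 1, 64, 64))    # features for the query image
enroll_feats = extractor(torch.randn(8, 1, 64, 64))  # same weights applied to N = 8 enrollment images
```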
- Example 500B illustrates feature extraction using weight-separated convolutional neural networks.
- In example 500B, a CNN 502 using a first set of parameters (e.g., weights, biases, etc.) may be used to extract features from the query image, and a second CNN 504 using a second set of parameters may be used to extract features from the plurality of enrollment images.
- CNNs 502 and 504 may use different weights and the same or different model architectures to extract visual features from query and enrollment images. Because the weights used in CNNs 502 and 504 are different, the CNNs may be trained to extract different information.
- CNN 502 may be trained to extract information that is discriminative for an anti-spoofing task.
- CNN 504 may be trained to extract information from the enrollment images that may be useful for representing the user and/or the sensor(s) used to capture the query and enrollment images.
- CNNs 502 and 504 may be trained jointly, for example, as part of an end-to-end anti-spoofing protection model.
- Example 500C illustrates feature extraction using a weight-hybrid convolutional neural network.
- Example 500C may be considered a hybrid of examples 500A and 500B.
- weight-separated CNNs 502 and 504 may be used to extract a first set of features from the query image and the plurality of enrollment images, respectively, as discussed above with respect to example 500B.
- the first set of features extracted by CNNs 502 and 504 may, as discussed, be low-level features specific to the query image and enrollment image domains, respectively.
- This first set of features may be input into a weight-shared CNN 506, which may be trained to output high-level features for the query image and enrollment images in a shared feature space. That is, combined feature representation 510, generated by the weight-shared CNN 506, may include features in a common feature space generated from low-level features in different feature spaces for the enrollment and query images.
- visual features extracted by the CNNs 502 and 504 may be combined into a stack of visual features.
- the stack of visual features may be input into weight-shared CNN 506 in order to generate the combined feature representation 510.
- the visual features extracted by CNNs 502 and 504 may have a same spatial shape to allow for these features to be stacked.
- convolutional layers in weight-shared CNN 506 may learn filters that compare inputs in spatial dimensions. However, inference may be less efficient, as enrollment image features may be precomputed only up to the input into the weight-shared CNN 506.
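One possible reading of the weight-hybrid arrangement of example 500C is sketched below with illustrative channel counts: separate low-level CNNs for the query and enrollment domains produce feature maps of the same spatial shape, which are stacked and passed through a shared CNN to obtain features in a common feature space.

```python
import torch
import torch.nn as nn

low_query = nn.Conv2d(1, 8, kernel_size=3, padding=1)    # low-level CNN for the query domain (CNN 502)
low_enroll = nn.Conv2d(1, 8, kernel_size=3, padding=1)   # low-level CNN for the enrollment domain (CNN 504)
shared = nn.Sequential(                                   # weight-shared high-level CNN (CNN 506)
    nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
)

query = torch.randn(1, 1, 64, 64)
enroll = torch.randn(1, 1, 64, 64)                        # one enrollment image with the same spatial shape
stacked = torch.cat([low_query(query), low_enroll(enroll)], dim=1)  # stack low-level feature maps on channels
combined_features = shared(stacked)                        # high-level features in a common feature space
```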
- Example 500D illustrates feature extraction from a stack of images including the query image and a plurality of enrollment images.
- the query image and enrollment images may be stacked based on one or more dimensions and fed to a single CNN 502 for feature extraction.
- the images may be spatially aligned so that visual features (e.g., ridges and valleys captured in a fingerprint image) are aligned similarly in each image in the stack of images.
- a combined feature representation 510 (e.g., of visual features from the stack of images) may then be generated by the CNN 502.
- the CNN 502 may be trained as part of an end-to-end anti-spoofing protection model and deployed to a computing device on which fingerprint authentication and anti-spoofing protection operations are performed.
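Example 500D can be sketched as concatenating the spatially aligned query and enrollment images on the channel dimension and processing the stack with a single CNN; the number of enrollment images and the layer sizes below are assumptions.

```python
import torch
import torch.nn as nn

N = 8  # assumed number of enrollment images
stack_cnn = nn.Sequential(
    nn.Conv2d(1 + N, 32, kernel_size=3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
)

query = torch.randn(1, 1, 64, 64)
enrollment = torch.randn(1, N, 64, 64)            # aligned enrollment images treated as channels
stacked = torch.cat([query, enrollment], dim=1)   # shape (1, 1 + N, 64, 64)
combined_representation = stack_cnn(stacked)      # single combined visual representation
```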
- the features extracted from the received fingerprint images and the enrollment fingerprint images may include one or more precomputed features. These precomputed features may include or be derived from other components in an anti-spoofing system (e.g., temperature, impedance, time, etc.). In some aspects, the precomputed features may be generated from the received images, such as a number of ridges or valleys in a fingerprint image, signal intensity, or the like. These precomputed features may be extracted similarly from the query and enrollment fingerprint images and may include visual features from the query and enrollment fingerprint images and features associated with metadata about the sensor or the environment in which the computing system operates.
- the precomputed features may be concatenated with the visual features extracted by the one or more CNNs 502, 504, and/or 506 to be the input of an artificial neural network used to determine whether the query fingerprint image is from a real fingerprint or a copy of the real fingerprint.
- the precomputed features may be infused into the one or more CNNs to condition extraction of visual features from the query and enrollment fingerprint images.
- While examples 500A-500D illustrate the use of CNNs to extract features from the query image and the plurality of enrollment images, any of a variety of artificial neural networks may be used to extract features from the query image and the plurality of enrollment images.
- features may be extracted from the query image and the plurality of enrollment images using recurrent neural networks, transformer neural networks, or the like.
- features extracted from the received query image may be combined with a combined feature representation of the plurality of enrollment fingerprint images to generate a combined representation that can be processed by an artificial neural network to determine whether the received query image is from a real fingerprint or a copy of the real fingerprint.
- the enrollment fingerprint images generally include multiple images for each enrolled finger, features can be extracted from the images for each finger and aggregated into a single enrollment feature representation.
- Various techniques may be used in feature aggregation stage 420 to combine the features extracted from each enrollment fingerprint image, including non-parametric techniques in which features are concatenated or computed, as well as parametric techniques that learn an optimal technique to combine the features extracted from each enrollment fingerprint image.
- FIG. 6 illustrates various techniques for generating the combined feature representation of the plurality of enrollment fingerprint images.
- Example 600A illustrates an example of generating the combined feature representation of the plurality of enrollment fingerprint images based on image stacking techniques.
- the query image and enrollment fingerprint images may be represented in a three-dimensional space of a channel, width, and height.
- the query image and one or more enrollment fingerprint images may be stacked on the channel dimension and fed as input into a convolutional neural network 602 to extract visual features 604 from the query fingerprint image and the enrollment fingerprint images.
- CNN 602 may be configured to combine information from the query fingerprint image and enrollment fingerprint images in the stack into a single visual representation. Because CNN 602 may process a same spatial region over multiple channels, generating a combined feature representation based on image stacking may be effective when the query and enrollment images share a same coordinate system (e.g., have the same height, width, and channel dimensions).
- Example 600B illustrates an example of feature stacking, or concatenation, into a concatenated feature output 612.
- each enrollment image 1 through N may be associated with features 1 through N extracted (e.g., a priori , during fingerprint enrollment, etc.) using a CNN, as discussed above.
- if features for an enrollment image are missing, a zero vector may be used in its place.
- each feature associated with an enrollment image may have dimensions M x 1
- the concatenated feature output 612 for an enrollment fingerprint image set of N images may have dimensions M * N x 1.
- features extracted from the received query image may also be concatenated with concatenated feature output 612 to generate the combination of the features extracted from the received query image and the combined feature representation of the plurality of enrollment fingerprint images.
- the combined feature representation of the plurality of enrollment fingerprint images may be compressed into a compact representation in which the features are aggregated.
- Example 600C illustrates an example of generating this compact representation based on mean and standard deviation information.
- features extracted from each enrollment fingerprint image may have dimensions M x 1.
- a computing system can calculate the mean across the features extracted from the N enrollment fingerprint images, and additional information, such as standard deviation, higher order moments, or other statistical information may also be calculated from the values of the features extracted from the N enrollment fingerprint images.
- a vector having size M x 2 may be generated as a concatenation of a mean feature vector 622 and a standard deviation feature vector 624.
- because the combined feature representation may be represented as a vector of size M x 2, the memory needed to store the combined feature representation may be reduced from being based on a linear relationship with the number of enrollment fingerprint images to a constant, which may reduce the number of parameters input in a layer of a neural network that processes the aggregated features.
- because statistical measures such as mean and standard deviation may be invariant to a number of data points, enrollment feature aggregation based on these statistical measures may be more robust to missing enrollment images in a data set.
- Examples 600A through 600C illustrate non-parametric techniques for aggregating enrollment fingerprint image features and infusing these enrollment fingerprint image features with features extracted from a received query fingerprint image.
- the use of non-parametric features may constrain the expressiveness of a model and its ability to process and combine features.
- various autoregressive models may be used to generate the combined feature representation of the plurality of enrollment fingerprint images, as illustrated in example 600D.
- the features extracted from the enrollment fingerprint images may be processed through an autoregressive model 632 to generate a combined feature output 634 having dimensions Mx 1.
- the autoregressive model 632 may include, for example, recurrent neural networks (RNNs), gated recurrent units (GRUs), long-short term memory (LSTM) models, transformer models, or the like.
- RNNs may be relatively simple, compact, and resource efficient; however, variations of autoregressive models such as GRUs or LSTM models may increase the expressiveness of the model (at the expense of additional multiply-and-accumulate (MAC) operations and a number of parameters).
- Transformer models may allow for relationships to be captured between elements that are distant from each other in the sequence of enrollment fingerprint images and may also allow for invariance with respect to the order in which enrollment fingerprint images are presented to the transformer models.
- autoregressive models may allow a sequence of images having an arbitrary length to be processed into an Mx 1 feature output 634 so that fingerprints may be enrolled using any arbitrary number of enrollment images. Further, autoregressive models may allow the enrollment fingerprint images to be processed sequentially, such as in the order in which the enrollment fingerprint images were captured during fingerprint enrollment. These autoregressive models may, for example, allow for patterns to be learned from the sequence of images, such as increasing humidity and/or temperature at the sensor used to generate the enrollment fingerprint images, which may in turn be used to account for environmental factors that may exist when a sensor captures a fingerprint of a user.
- the inputs and outputs of the GRU may be defined according to the equation h_i^l = GRU(x_i^l, h_{i-1}^l), where x_i^l represents the i-th latent feature at layer l, h_{i-1}^l represents a previous activation for layer l, and h_i^l represents the current activation for layer l.
- the input to the first layer may be the enrollment features (i.e., x_i^1 = e_i), and the last activation of the final GRU layer L may be selected as the aggregated feature for the enrollment set, such that e_agg = h_N^L.
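A minimal sketch of GRU-based aggregation, assuming the enrollment features are presented in capture order and the final hidden state of the last GRU layer serves as the aggregated enrollment feature; dimensions are illustrative.

```python
import torch
import torch.nn as nn

D = 64                                  # assumed per-image feature dimensionality
gru = nn.GRU(input_size=D, hidden_size=D, num_layers=2, batch_first=True)

enroll_feats = torch.randn(1, 8, D)     # (batch, N enrollment images, D), in capture order
outputs, h_n = gru(enroll_feats)
aggregated = h_n[-1]                    # last activation of the final GRU layer, shape (1, D)
```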
- key-query-value attention mechanisms between query and enrollment features may be used to generate the aggregated features for the enrollment data set and the query image.
- the model may learn the importance of each image in the enrollment data set relative to a specific query image, as discussed in further detail below with respect to FIG. 8B.
- the features of the enrollment images may be aggregated using graph neural networks (GNNs), such as GNN 720 illustrated in FIG. 7, which can model complex relationships between enrollment and query features.
- the enrollment and query features may be represented as nodes in a graph.
- a GNN may operate on a layer-by-layer basis to process the graph.
- GNN 720 includes an adjacency computation block 722 and a graph computation block 724 for a first layer of GNN 720 and an adjacency computation block 726 and graph computation block 728 for a second layer of GNN 720, in which the second layer takes, as input, the graph computed by the graph computation block 724 of the first layer in GNN 720.
- GNN 720 illustrates two layers including an adjacency computation block and a graph computation block, it should be recognized that GNN 720 may include any number of layers.
- multiple adjacency matrices may be computed based on the features in a given node, and the adjacency matrices may be applied in various graph convolution operations.
- An adjacency matrix A may include a plurality of elements A_ij obtained using a distance function ψ(·) between node features v_i and v_j, such that A_ij = ψ(v_i, v_j).
- a neural network can parameterize the distance function ψ(·) such that a scalar value is output from the vectors representing node features v_i and v_j.
- a graph convolution operation may be performed according to the equation V^{l+1} = σ(A V^l W^l), where V^l represents the node features at layer l, W^l represents a learnable weight matrix for layer l, and σ represents a nonlinear function.
- the inputs to the first layer of the GNN may include N + 1 nodes including N enrollment features and the query feature.
- the output features for the query node may be used as a prediction of whether the query image is an image of a real biometric source for a user being authenticated or a copy of the real biometric source.
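The following sketch shows one generic way (not necessarily the patent's exact formulation) to build such a GNN layer: a small MLP parameterizes the pairwise distance function used to form the adjacency matrix over the N enrollment nodes and the query node, followed by a simple graph convolution.

```python
import torch
import torch.nn as nn

class GraphLayer(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        # MLP-parameterized distance function over pairs of node features
        self.dist = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU(), nn.Linear(dim, 1))
        self.proj = nn.Linear(dim, dim)

    def forward(self, nodes: torch.Tensor) -> torch.Tensor:
        # nodes: (N + 1, dim) -- N enrollment feature nodes plus the query node
        n = nodes.size(0)
        pairs = torch.cat([nodes.unsqueeze(1).expand(n, n, -1),
                           nodes.unsqueeze(0).expand(n, n, -1)], dim=-1)
        adj = torch.softmax(self.dist(pairs).squeeze(-1), dim=-1)  # adjacency from learned distances
        return torch.relu(self.proj(adj @ nodes))                   # simple graph convolution

layer = GraphLayer(dim=64)
nodes = torch.randn(9, 64)   # 8 enrollment nodes + 1 query node
updated = layer(nodes)       # the query node's output can feed the live/spoof prediction
```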
- in feature infusion stage 430, the query and enrollment features can be combined using neural networks. As discussed, the combined features may then be processed through an artificial neural network, such as an MLP, which can generate an output indicating whether the received query image is an image of a real fingerprint or a copy of the real fingerprint.
- Various techniques may be used to combine the query and enrollment fingerprint image features in feature infusion stage 430, including non-parametric techniques and parametric techniques.
- non-parametric techniques for combining features from the query and enrollment fingerprint images may include the use of distance metrics to compare query and enrollment images.
- Parametric techniques may, for example, use self-attention and/or gating mechanisms to learn techniques by which features extracted from the query and enrollment fingerprint images may be combined.
- FIGs. 8A-8C illustrate examples of these various techniques.
- FIG. 8A illustrates an example 800A in which features extracted from the query and enrollment fingerprint images are combined based on a likelihood of the received query image being from a real fingerprint, given a mean and standard deviation calculated based on features extracted from the enrollment fingerprint images.
- an M x 1 feature vector 802 (designated as x) of features extracted from the received query image may be combined with a mean feature vector 804 (designated as μ) and a standard deviation vector 806 (designated as σ) derived from the enrollment features to generate a combined vector 808 with dimensions M x 1, with each value in the combined vector 808 being calculated as a log likelihood of a probability that x is from a real fingerprint, conditioned on μ and σ (i.e., as log p(x | μ, σ)).
- Mean feature vector 804 and standard deviation vector 806 may be interpreted as a representation of expected features of a live datapoint (e.g., an image captured of a real fingerprint as opposed to a copy of the real fingerprint).
- M Gaussian distributions can be used to model the M-dimensional features, and thus, the log-likelihood of each dimension j of the query features may be calculated according to the equation log p(x_j | μ_j, σ_j) = -log(σ_j √(2π)) - (x_j - μ_j)² / (2 σ_j²), with combined vector 808 being an M-dimensional representation that combines the enrollment and query features.
- Combined vector 808 may subsequently be processed through an artificial neural network, such as an MLP, to determine whether x corresponds to an image captured from a real fingerprint or a copy of the real fingerprint.
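- As a non-limiting sketch of the combination in example 800A (in PyTorch; the function name and the eps stabilizer are assumptions), the per-dimension Gaussian log-likelihood may be computed from the enrollment mean and standard deviation as follows:

```python
import math
import torch

def loglikelihood_combine(x: torch.Tensor, enrollment_feats: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Combine query features with enrollment statistics as per-dimension Gaussian log-likelihoods.

    x:                (M,)   feature vector 802 extracted from the received query image
    enrollment_feats: (N, M) features extracted from the N enrollment fingerprint images
    returns:          (M,)   log p(x_i | m_i, s_i) for each dimension (combined vector 808)
    """
    m = enrollment_feats.mean(dim=0)          # mean feature vector 804
    s = enrollment_feats.std(dim=0) + eps     # standard deviation vector 806 (eps avoids division by zero)
    return -torch.log(s * math.sqrt(2 * math.pi)) - (x - m) ** 2 / (2 * s ** 2)

# combined = loglikelihood_combine(query_feats, enroll_feats)   # then classified by an MLP (live vs. spoof)
```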
- FIG. 8B illustrates an example 800B in which features extracted from the query and enrollment fingerprint images are combined using attention-based models (e.g., using self-attention).
- a self-attention layer may include a plurality of MLPs.
- MLP_Q 812 may embed the features extracted from the query fingerprint image into a query vector 822.
- MLP_K 814 may embed enrollment features in a key vector 824, with the same dimensionality as the query vector 822.
- MLP_V 816 may embed each enrollment fingerprint image feature into a value vector 826.
- the information in key vector 824 may be used to compute an importance of each visual feature in the value vector 826 with respect to features in the query vector 822.
- an inner product may be calculated between the query vector 822 and the key vector 824, and then scale and softmax layers may transform the importance scores to probability values.
- the probability value may be represented, for example, according to the equation p = softmax(Q Kᵀ / √d), where d is the dimensionality of the key vector.
- an attention query may be defined according to the equation Q = A_Q(x_q), where x_q represents the features extracted from the query fingerprint image.
- the attention keys and values may be generated from the enrollment images according to the equations K = A_K(x_e) and V = A_V(x_e), respectively, where x_e represents the features extracted from an enrollment fingerprint image and A_Q, A_K, and A_V are linear layers that map from a D-dimensional feature space to the attention feature space.
- the attention weights obtained from Q and K may be applied to the value vectors V to obtain an aggregated feature.
- the aggregated feature may be represented by the equation Attention(Q, K, V) = softmax(Q Kᵀ / √d) V, where Q corresponds to a query image, Kᵀ corresponds to a key image from the set of enrollment images, and V corresponds to a value associated with the pairing of Q and Kᵀ.
- the probability values output from importance calculation layer 832 may be linearly combined at combining layer 834 with the values vector 826. This generally results in a linear combination of the values vector 826, which includes an aggregated representation of the enrollment fingerprint image features, conditioned on the query features.
- a skip connection may be used to include the query features in the input of a next layer of a CNN or an MLP classifier 836.
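- A minimal sketch of this attention-based infusion (in PyTorch; single linear layers stand in for MLPs 812, 814, and 816, and the class and variable names are assumptions) might look as follows:

```python
import torch
import torch.nn as nn

class EnrollmentAttention(nn.Module):
    """Aggregate enrollment features conditioned on the query feature via scaled dot-product attention."""

    def __init__(self, feat_dim: int, attn_dim: int):
        super().__init__()
        self.a_q = nn.Linear(feat_dim, attn_dim)  # stands in for MLP_Q 812
        self.a_k = nn.Linear(feat_dim, attn_dim)  # stands in for MLP_K 814
        self.a_v = nn.Linear(feat_dim, attn_dim)  # stands in for MLP_V 816
        self.scale = attn_dim ** 0.5

    def forward(self, query_feat: torch.Tensor, enroll_feats: torch.Tensor) -> torch.Tensor:
        # query_feat: (1, feat_dim); enroll_feats: (N, feat_dim)
        q = self.a_q(query_feat)                                # query vector 822
        k = self.a_k(enroll_feats)                              # key vectors 824
        v = self.a_v(enroll_feats)                              # value vectors 826
        weights = torch.softmax(q @ k.T / self.scale, dim=-1)   # importance of each enrollment image
        aggregated = weights @ v                                # linear combination of the value vectors
        return torch.cat([aggregated, q], dim=-1)               # skip connection keeps the query features
```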
- squeeze-excite gating may be used to aggregate and infuse (combine) the enrollment information given the query features.
- squeeze-excite gating may be used to gate query features, conditioned on the enrollment features.
- a convolutional neural network 840 taking a query image as input, may include a plurality of squeeze-excite modules.
- in a squeeze-excite module, a stack 842 of intermediate query visual features having width, height, and channel dimensions W × H × C may be squeezed into a C × 1 representation 844, which may be combined with enrollment fingerprint image features and processed through an MLP 846 to generate a C × 1 representation 848.
- a product of stack 842 and the C × 1 representation 848 may be calculated to generate a stack of features 850, which may also have width, height, and channel dimensions W × H × C.
- the gating may be performed on the channel dimension of the visual features and may be performed at any layer in CNN 840 that is parsing the query image.
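- A non-limiting sketch of such a squeeze-excite module (in PyTorch; the two-layer gating MLP, global average pooling for the squeeze step, and the class name are assumptions) is shown below:

```python
import torch
import torch.nn as nn

class EnrollmentSqueezeExcite(nn.Module):
    """Gate intermediate query features on the channel dimension, conditioned on enrollment features."""

    def __init__(self, channels: int, enroll_dim: int):
        super().__init__()
        self.mlp = nn.Sequential(                       # stands in for MLP 846
            nn.Linear(channels + enroll_dim, channels),
            nn.ReLU(),
            nn.Linear(channels, channels),
            nn.Sigmoid(),                               # gate values in [0, 1]
        )

    def forward(self, feats: torch.Tensor, enroll_feats: torch.Tensor) -> torch.Tensor:
        # feats: (B, C, H, W) intermediate query features (stack 842); enroll_feats: (B, enroll_dim)
        squeezed = feats.mean(dim=(2, 3))                              # squeeze spatial dims -> (B, C), representation 844
        gates = self.mlp(torch.cat([squeezed, enroll_feats], dim=-1))  # (B, C), representation 848
        return feats * gates.unsqueeze(-1).unsqueeze(-1)               # channel-wise gating -> stack 850
```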
- an anti-spoofing protection model may have access to outputs of a fingerprint matching system, which may be used to condition an anti-spoofing protection model to use the most informative enrollment image(s) for a given finger.
- the anti-spoofing protection model may receive, from a fingerprint matching system, information identifying the enrollment fingerprint image that matches the query fingerprint image.
- the anti-spoofing protection model can receive, from the fingerprint matching system, information about the transformation applied to the query or enrollment image to find the matching enrollment image.
- the information about the transformation may be represented as a matrix such that the transformed image is calculated as the product of a transformation matrix and the original image. That is, for any given transformation, the transformed image may be represented by the equation I' = T · I, where T is the transformation matrix and I is the original image.
- FIG. 9 illustrates an example of alignment preprocessing that may be performed on a query image or one or more enrollment images prior to determining whether the query image is from a real fingerprint, according to aspects of the present disclosure.
- a transformation may be applied to the matching enrollment image 904 to generate a combined image 906.
- the combined image 906 may include a transformation of the enrollment image to the coordinate system of the query image, and the combined image 906 may be padded to generate input image 908.
- Input image 908, including the padded combination of the query image 902 and matching enrollment image 904, may be input into an anti-spoofing protection model in which a CNN 910 extracts visual features 912 from the combination of the query image 902 and matching enrollment image 904, and the visual features 912 are processed through a neural network, such as MLP 914, to determine whether the query image 902 is from a real fingerprint or a copy of the real fingerprint.
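- As a non-limiting sketch of this alignment preprocessing (using OpenCV and NumPy; the 2×3 affine form of the matcher-provided transform, the padding scheme, and the function name are assumptions), the enrollment image may be warped into the query coordinate system and stacked with the query image:

```python
import cv2
import numpy as np

def build_aligned_input(query_img: np.ndarray, enroll_img: np.ndarray, transform: np.ndarray) -> np.ndarray:
    """Warp the matching enrollment image into the query coordinate system and stack on the channel axis.

    query_img, enroll_img: single-channel (H, W) fingerprint images
    transform:             2x3 affine matrix assumed to be reported by the fingerprint matcher
    """
    h, w = query_img.shape[:2]
    aligned = cv2.warpAffine(enroll_img, transform, (w, h))   # enrollment image -> query coordinates
    stacked = np.stack([query_img, aligned], axis=-1)         # (H, W, 2), query + aligned enrollment
    # Pad/crop to a fixed network input size; (180, 80) follows the example dimensions given later.
    padded = np.zeros((180, 80, 2), dtype=stacked.dtype)
    padded[:min(180, h), :min(80, w), :] = stacked[:180, :80, :]
    return padded
```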
- Various techniques may be used to leverage spatial alignment information in an anti-spoofing protection model.
- the query image and aligned enrollment image may be stacked in the channel dimension, and the CNN can learn filters that compare features across the spatially aligned inputs.
- difference techniques that subtract the enrollment image from the query image may be used to highlight features that change between the enrollment image and the query image in overlapping areas.
- overlay techniques may allow a CNN to observe how shapes combine (e.g., at the edges of images). Intersection techniques, in which only the intersection of the query and enrollment images is presented to a CNN, may constrain the CNN to examine features that can be compared and may exclude content for which the CNN has no reference.
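- The channel-stacking, difference, overlay, and intersection options described above might be expressed as follows (a non-limiting NumPy sketch; the mode names and the zero-valued-pixel convention for coverage are assumptions):

```python
import numpy as np

def combine_aligned(query: np.ndarray, enroll: np.ndarray, mode: str = "stack") -> np.ndarray:
    """Combine a query image and a spatially aligned enrollment image (both (H, W), zero outside coverage)."""
    if mode == "stack":            # stack on the channel dimension; CNN filters compare aligned pixels
        return np.stack([query, enroll], axis=-1)
    if mode == "difference":       # highlight features that change within the overlapping area
        return (query.astype(np.float32) - enroll.astype(np.float32))[..., None]
    if mode == "overlay":          # expose how the two shapes combine, e.g., at image edges
        return np.maximum(query, enroll)[..., None]
    if mode == "intersection":     # keep only content that both images cover
        mask = (query > 0) & (enroll > 0)
        return np.stack([query * mask, enroll * mask], axis=-1)
    raise ValueError(f"unknown mode: {mode}")
```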
- image stitching techniques may be used where geometric transformation coefficients are available for a plurality of enrollment fingerprint images.
- each image in the plurality of enrollment fingerprint images may be transformed to the same spatial coordinates and stitched together, which may allow a larger area of the enrolled finger to be recovered and increase the coverage of the enrollment fingerprint information with respect to a single captured query fingerprint image.
- query and enrollment images may be spatially aligned through three-dimensional transformations.
- the enrollment images may be transformed using three-dimensional rotations and shifts such that the query image and aligned enrollment images can be stacked in one or more channel dimensions.
- the performance of the anti-spoofing protection models described herein may be based on the domain, task, data set, and hardware under consideration.
- the anti-spoofing protection model architecture described herein may be based on CNN and MLP components.
- the CNN may have eleven two-dimensional convolutional layers, alternated with two-dimensional batch normalization layers and rectified linear unit (ReLU) activation functions.
- the same architecture may be maintained for the CNNs used to extract features from the received query fingerprint image and the plurality of enrollment fingerprint images.
- the CNN may be divided between the separated and shared portions after a convolutional layer that is approximately in the middle of the CNN.
- the CNN kernels may have a receptive field with 3 x 3 dimensions and may alternate between strides to downsample original images.
- the input of the network may, for example, have three dimensions (namely, width, height, and channel dimensions) of (180, 80, 2).
- the output visual features may have a shape, in the width, height, and channel dimensions of (3, 2, 32), which allows the CNN to capture different features on the channel dimension and retain some spatial information within the 3 x 2 spatial coordinates.
- the MLP may have four linear layers alternating between batch normalization and ReLU activation functions, and may omit a dropout function.
- An input array including approximately 200 features may be gradually compressed through the MLP until the compression results in a two-dimensional output.
- the output generally includes the scores for an input being a live sample (e.g., from a real biometric data source) and the input being a spoof sample (e.g., from a copy of the real biometric data source).
- a softmax function may map these values into probabilities.
- the MLP may be trained using supervised learning techniques, for example, leveraging cross-entropy loss.
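- A rough, non-limiting sketch of such a CNN and MLP (in PyTorch; the intermediate channel widths, the exact stride pattern, and the 192-feature flattened size are assumptions chosen only to be consistent with the dimensions described above) is shown below:

```python
import torch.nn as nn

def conv_block(c_in: int, c_out: int, stride: int) -> nn.Sequential:
    return nn.Sequential(nn.Conv2d(c_in, c_out, kernel_size=3, stride=stride, padding=1),
                         nn.BatchNorm2d(c_out), nn.ReLU())

# Eleven 3x3 conv layers alternating batch norm and ReLU; the stride-2 layers downsample a
# 2-channel 180 x 80 input to roughly 32 x 3 x 2. Widths other than the final 32 are assumptions.
channels = [2, 8, 8, 16, 16, 16, 32, 32, 32, 32, 32, 32]
strides = [2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2]
cnn = nn.Sequential(*[conv_block(channels[i], channels[i + 1], strides[i]) for i in range(11)])

# Four linear layers gradually compressing ~200 features (3 * 2 * 32 = 192 here) into two scores
mlp = nn.Sequential(
    nn.Flatten(),
    nn.Linear(192, 128), nn.BatchNorm1d(128), nn.ReLU(),
    nn.Linear(128, 64), nn.BatchNorm1d(64), nn.ReLU(),
    nn.Linear(64, 16), nn.BatchNorm1d(16), nn.ReLU(),
    nn.Linear(16, 2),   # live vs. spoof scores; softmax + cross-entropy loss during training
)
```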
- aspects of the present disclosure leverage enrollment data to determine whether a query image is from a real biometric data source or a copy of the real biometric data source.
- the anti-spoofing protection models described herein can extract sensor-specific information from enrollment data by taking the enrollment data as a reference and can extract subject-specific information from the enrollment data. While access to the enrollment data is needed, aspects of the present disclosure may pre-process the enrollment data into extracted features during sensor calibration and enrollment, which may allow the anti-spoofing protection models herein to access an abstract representation of the enrollment fingerprint images. Further, the features extracted from the enrollment images may be precomputed, which may reduce memory and compute costs for fingerprint authentication and anti-spoofing protection.
- query images and enrollment images may be processed through the neural network(s). Training may be optimized based on the hardware on which the models are trained, for example, by constraining the size of the neural network or by loading partial data sets into the memory and processor used to train the neural network(s).
- features from the enrollment images may be at least partially pre-computed and stored up to the point at which the features are combined with the query features, which may reduce compute time and memory used to perform an inference with respect to whether the query images are from a real biometric data source or a copy of the real biometric data source.
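- As a non-limiting sketch of this precomputation (in PyTorch; the helper name, the cache file, and per-image feature caching are assumptions), the enrollment features may be computed once at enrollment time and stored for reuse at authentication time:

```python
import torch

@torch.no_grad()
def precompute_enrollment_features(cnn: torch.nn.Module, enrollment_images: list) -> torch.Tensor:
    """Run enrollment images through the feature CNN once at enrollment time and cache the result."""
    feats = torch.stack([cnn(img.unsqueeze(0)).flatten() for img in enrollment_images])  # (N, D)
    torch.save(feats, "enrollment_features.pt")   # hypothetical cache location
    return feats

# At authentication time only the query image needs a forward pass; the cached enrollment
# features are loaded and combined with the fresh query features (e.g., via attention or gating).
# enrollment_feats = torch.load("enrollment_features.pt")
```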
- the behavior of the anti-spoofing protection models described herein may be finger and user agnostic, as the anti-spoofing protection models may be configured to focus on the relevant enrollment image set for the biometric data source and the user being authenticated.
- the personalized anti-spoofing protection model described herein may provide for improved accuracy of anti-spoofing protection compared to non- personalized anti-spoofing protection models.
- Spoofing attacks generally fail at a higher rate when processed through the personalized anti-spoofing protection models described herein than when processed through non-personalized anti-spoofing protection models.
- Because spoofing attacks generally fail at a higher rate using the personalized anti-spoofing protection models, aspects of which are described herein, computing systems may be made more secure against attempts to gain unauthorized access to protected computing resources using fake biometric data sources and/or images derived therefrom.
- FIG. 10 depicts an example processing system 1000 for biometric authentication using machine learning-based anti-spoofing protection, such as described herein for example with respect to FIG. 3.
- Processing system 1000 includes a central processing unit (CPU) 1002, which in some examples may be a multi-core CPU. Instructions executed at the CPU 1002 may be loaded, for example, from a program memory associated with the CPU 1002 or may be loaded from a partition in memory 1024.
- Processing system 1000 also includes additional processing components tailored to specific functions, such as a graphics processing unit (GPU) 1004, a digital signal processor (DSP) 1006, a neural processing unit (NPU) 1008, a multimedia processing unit 1010, and a wireless connectivity component 1012.
- An NPU, such as 1008, is generally a specialized circuit configured for implementing all the necessary control and arithmetic logic for executing machine learning algorithms, such as algorithms for processing artificial neural networks (ANNs), deep neural networks (DNNs), random forests (RFs), and the like.
- An NPU may sometimes alternatively be referred to as a neural signal processor (NSP), tensor processing unit (TPU), neural network processor (NNP), intelligence processing unit (IPU), vision processing unit (VPU), or graph processing unit.
- NPUs such as 1008, are configured to accelerate the performance of common machine learning tasks, such as image classification, machine translation, object detection, and various other predictive models.
- a plurality of NPUs may be instantiated on a single chip, such as a system on a chip (SoC), while in other examples they may be part of a dedicated neural-network accelerator.
- NPUs may be optimized for training or inference, or in some cases configured to balance performance between both.
- the two tasks may still generally be performed independently.
- NPUs designed to accelerate training are generally configured to accelerate the optimization of new models, which is a highly compute-intensive operation that involves inputting an existing dataset (often labeled or tagged), iterating over the dataset, and then adjusting model parameters, such as weights and biases, in order to improve model performance. Generally, optimizing based on a wrong prediction involves propagating back through the layers of the model and determining gradients to reduce the prediction error.
- NPUs designed to accelerate inference are generally configured to operate on complete models. Such NPUs may thus be configured to input a new piece of data and rapidly process it through an already trained model to generate a model output (e.g., an inference).
- NPU 1008 is a part of one or more of CPU 1002, GPU 1004, and/or DSP 1006.
- wireless connectivity component 1012 may include subcomponents, for example, for third generation (3G) connectivity, fourth generation (4G) connectivity (e.g., 4G LTE), fifth generation connectivity (e.g., 5G or NR), Wi-Fi connectivity, Bluetooth connectivity, and other wireless data transmission standards.
- Wireless connectivity component 1012 is further connected to one or more antennas 1014.
- Processing system 1000 may also include one or more sensor processing units 1016 associated with any manner of sensor, one or more image signal processors (ISPs) 1018 associated with any manner of image sensor, and/or a navigation processor 1020, which may include satellite-based positioning system components (e.g., GPS or GLONASS) as well as inertial positioning system components.
- Processing system 1000 may also include one or more input and/or output devices 1022, such as screens, touch-sensitive surfaces (including touch-sensitive displays), physical buttons, speakers, microphones, and the like.
- one or more of the processors of processing system 1000 may be based on an ARM or RISC-V instruction set.
- Processing system 1000 also includes memory 1024, which is representative of one or more static and/or dynamic memories, such as a dynamic random access memory, a flash-based static memory, and the like.
- memory 1024 includes computer-executable components, which may be executed by one or more of the aforementioned processors of processing system 1000.
- memory 1024 includes image feature extracting component 1024A, feature representation combining component 1024B, biometric authenticity determining component 1024C, and user access controlling component 1024D.
- the depicted components, and others not depicted, may be configured to perform various aspects of the methods described herein.
- processing system 1000 and/or components thereof may be configured to perform the methods described herein.
- aspects of processing system 1000 may be omitted, such as where processing system 1000 is a server computer or the like.
- multimedia processing unit 1010, wireless connectivity component 1012, sensor processing units 1016, ISPs 1018, and/or navigation processor 1020 may be omitted in other embodiments.
- aspects of processing system 1000 may be distributed, such as training a model and using the model to generate inferences, such as user verification predictions.
- Clause 1 A method of biometric authentication, comprising: receiving an image of a biometric data source for a user; extracting, through a first artificial neural network, features for at least the received image; combining the extracted features for the at least the received image and a combined feature representation of a plurality of enrollment biometric data source images; determining, using the combined extracted features for the at least the received image and the combined feature representation of the plurality of enrollment biometric data source images as input into a second artificial neural network, whether the received image of the biometric data source for the user is from a real biometric data source or a copy of the real biometric data source; and taking one or more actions to allow or deny the user access to a protected resource based on the determination.
- Clause 2 The method of Clause 1, further comprising aggregating features extracted by a neural network from information derived from a plurality of enrollment biometric data source images into the combined feature representation of the plurality of enrollment biometric data source images.
- Clause 3 The method of Clause 2, wherein the features extracted from the information derived from the plurality of enrollment biometric data source images are extracted during user fingerprint enrollment.
- Clause 4 The method of any one of Clauses 2 or 3, wherein the features extracted from the information derived from the plurality of enrollment biometric data source images comprise features extracted from a representation derived from each of the plurality of enrollment biometric data source images.
- Clause 5 The method of any one of Clauses 2 through 4, wherein aggregating features extracted from the information derived from the plurality of enrollment biometric data source images into the combined feature representation comprises concatenating features extracted from each of the plurality of enrollment biometric data source images into a single set of features.
- Clause 6 The method of any one of Clauses 2 through 4, wherein aggregating features extracted from the information derived from the plurality of enrollment biometric data source images into the combined feature representation comprises generating a feature output based on an autoregressive model and features extracted from each of the plurality of enrollment biometric data source images.
- Clause 7 The method of any one of Clauses 2 through 4, wherein aggregating features extracted from the information derived from the plurality of enrollment biometric data source images into the combined feature representation comprises generating, from the features extracted from the plurality of enrollment biometric data source images, an average and a standard deviation associated with the features extracted from the plurality of enrollment biometric data source images.
- Clause 8 The method of any one of Clauses 2 through 7, wherein: the first neural network and the second neural network comprise convolutional neural networks, and the first artificial neural network shares at least a subset of weights associated with the second artificial neural network.
- Clause 9 The method of any one of Clauses 2 through 8, further comprising extracting additional features from the received image and the plurality of enrollment images using a weight-shared convolutional neural network, the extracted features for the received image, and the features extracted from the plurality of enrollment biometric data source images.
- Clause 10 The method of any one of Clauses 1 through 9, wherein extracting features for the at least the received image comprises: combining the received image and the plurality of enrollment biometric data source images into a stack of images; and extracting the features for the received image and features for each of the plurality of enrollment biometric data source images by processing the stack of images through the first artificial neural network.
- Clause 11 The method of Clause 10, wherein combining the received image and the plurality of enrollment biometric data source images into the stack of images comprises: identifying, relative to at least one image of the plurality of enrollment biometric data source images, a transformation to apply to the received image such that the received image is aligned with at least a portion of the at least one image of the plurality of enrollment biometric data source images; modifying the received image based on the identified transformation; and generating a stack including the modified received image and the at least the one image of the plurality of enrollment biometric data source images.
- Clause 12 The method of Clause 11, wherein generating the stack including the modified received image and the plurality of enrollment biometric data source images comprises one or more of: stacking the modified received image and the at least the one image of the plurality of enrollment biometric data source images on a channel dimension, subtracting the modified received image from the at least the one image of the plurality of enrollment biometric data source images, overlaying the received image on the at least the one image of the plurality of enrollment biometric data source images, outputting an intersection of the modified received image and the at least the one image of the plurality of enrollment biometric data source images, or transforming the modified received image based on a stitched version of the plurality of enrollment biometric data source images.
- Clause 13 The method of Clause 10, wherein combining the received image and the plurality of enrollment biometric data source images into the stack of images comprises: identifying, relative to the received image, a transformation to apply to at least one image of the plurality of enrollment biometric data source images such that the received image is aligned with at least a portion of the at least one image of the plurality of enrollment biometric data source images; modifying the at least the one image of the plurality of enrollment biometric data source images based on the identified transformation; and generating a stack including the received image and the modified at least one image of the plurality of enrollment biometric data source images.
- Clause 14 The method of Clause 13, wherein generating the stack including the received image and the modified at least the one image of the plurality of enrollment biometric data source images comprises: stacking the received image and the modified at least the one image of the plurality of enrollment biometric data source images on a channel dimension, subtracting the received image from the modified at least the one image of the plurality of enrollment biometric data source images, overlaying the received image on the modified at least the one image of the plurality of enrollment biometric data source images, or outputting an intersection of the received image and the modified at least the one image of the plurality of enrollment biometric data source images.
- Clause 15 The method of any one of Clauses 1 through 14, wherein determining whether the received image of the biometric data source for the user is from a real biometric data source or a copy of the real biometric data source comprises calculating a distance metric comparing the received image and the plurality of enrollment biometric data source images.
- Clause 16 The method of any one of Clauses 1 through 14, wherein determining whether the received image of the biometric data source for the user is from a real biometric data source or a copy of the real biometric data source comprises calculating a log likelihood of the received image being from a real biometric data source, given a mean and a standard deviation associated with the features extracted from the plurality of enrollment biometric data source images.
- Clause 17 The method of any one of Clauses 1 through 14, wherein determining whether the received image of the biometric data source for the user is from a real biometric data source or a copy of the real biometric data source comprises weighting the extracted features for the received image and the features extracted from the plurality of enrollment biometric data source images using a key-query-value attention layer.
- Clause 18 The method of any one of Clauses 1 through 14, wherein determining whether the received image of the biometric data source for the user is from a real biometric data source or a copy of the real biometric data source comprises: embedding the extracted features for the received image into a query vector using a first multi-layer perceptron; embedding the features extracted from the plurality of enrollment biometric data source images into a key vector using a second multi-layer perceptron; embedding the features extracted from the plurality of enrollment biometric data source images into a value vector using a third multi-layer perceptron; and generating a value corresponding to a likelihood that the received image is from a real biometric data source based on an inner product between the query vector and the key vector, conditioned on features embedded into the query vector.
- Clause 19 The method of any one of Clauses 1 through 14, wherein determining whether the received image of the biometric data source from the user is from a real biometric data source or a copy of the real biometric data source comprises gating one or more of the extracted features for the received image based on features extracted from the plurality of enrollment biometric data source images.
- Clause 20 The method of any one of Clauses 1 through 14, wherein: determining whether the received image of the biometric data source from the user is from a real biometric data source or a copy of the real biometric data source comprises gating the extracted features for the received image in a squeeze-excite network based on the features extracted from the plurality of enrollment biometric data source images; the extracted features are represented by a height dimension, a width dimension, and a channel dimension; and the gating is performed on the channel dimension.
- Clause 21 The method of any one of Clauses 1 through 20, wherein the received image of the biometric data source for the user comprises an image of a fingerprint of the user.
- Clause 22 The method of any one of Clauses 1 through 21, wherein the received image of the biometric data source for the user comprises an image of a face of the user.
- Clause 23 A processing system, comprising: a memory comprising computer-executable instructions and one or more processors configured to execute the computer-executable instructions and cause the processing system to perform a method in accordance with any one of Clauses 1-22.
- Clause 24 A processing system, comprising means for performing a method in accordance with any one of Clauses 1-22.
- Clause 25 A non-transitory computer-readable medium comprising computer-executable instructions that, when executed by one or more processors of a processing system, cause the processing system to perform a method in accordance with any one of Clauses 1-22.
- Clause 26 A computer program product embodied on a computer-readable storage medium comprising code for performing a method in accordance with any one of Clauses 1-22.
- an apparatus may be implemented or a method may be practiced using any number of the aspects set forth herein.
- the scope of the disclosure is intended to cover such an apparatus or method that is practiced using other structure, functionality, or structure and functionality in addition to, or other than, the various aspects of the disclosure set forth herein. It should be understood that any aspect of the disclosure disclosed herein may be embodied by one or more elements of a claim.
- exemplary means “serving as an example, instance, or illustration.” Any aspect described herein as “exemplary” is not necessarily to be construed as preferred or advantageous over other aspects.
- a phrase referring to "at least one of" a list of items refers to any combination of those items, including single members.
- “at least one of: a, b, or c” is intended to cover a, b, c, a-b, a-c, b-c, and a-b-c, as well as any combination with multiples of the same element (e.g., a-a, a-a-a, a-a-b, a-a-c, a-b-b, a-c-c, b-b, b-b-b, b-b-c, c-c, and c-c-c or any other ordering of a, b, and c).
- determining encompasses a wide variety of actions. For example, “determining” may include calculating, computing, processing, deriving, investigating, looking up (e.g., looking up in a table, a database or another data structure), ascertaining and the like. Also, “determining” may include receiving (e.g., receiving information), accessing (e.g., accessing data in a memory) and the like. Also, “determining” may include resolving, selecting, choosing, establishing, and the like.
- the methods disclosed herein comprise one or more steps or actions for achieving the methods.
- the method steps and/or actions may be interchanged with one another without departing from the scope of the claims.
- the order and/or use of specific steps and/or actions may be modified without departing from the scope of the claims.
- the various operations of methods described above may be performed by any suitable means capable of performing the corresponding functions.
- the means may include various hardware and/or software component(s) and/or module(s), including, but not limited to a circuit, an application specific integrated circuit (ASIC), or processor.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Health & Medical Sciences (AREA)
- Human Computer Interaction (AREA)
- Computing Systems (AREA)
- Databases & Information Systems (AREA)
- General Health & Medical Sciences (AREA)
- Medical Informatics (AREA)
- Software Systems (AREA)
- Collating Specific Patterns (AREA)
Abstract
Description
Claims
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020237033025A KR20230169104A (en) | 2021-04-09 | 2022-04-09 | Personalized biometric anti-spoofing protection using machine learning and enrollment data |
BR112023019936A BR112023019936A2 (en) | 2021-04-09 | 2022-04-09 | PERSONALIZED BIOMETRIC ANTI-COOKING PROTECTION USING MACHINE LEARNING AND REGISTRATION DATA |
CN202280025687.7A CN117121068A (en) | 2021-04-09 | 2022-04-09 | Personalized biometric anti-fraud protection using machine learning and enrollment data |
EP22719496.6A EP4320606A1 (en) | 2021-04-09 | 2022-04-09 | Personalized biometric anti-spoofing protection using machine learning and enrollment data |
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US202163173267P | 2021-04-09 | 2021-04-09 | |
US63/173,267 | 2021-04-09 | ||
US17/658,573 | 2022-04-08 | ||
US17/658,573 US20220327189A1 (en) | 2021-04-09 | 2022-04-08 | Personalized biometric anti-spoofing protection using machine learning and enrollment data |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022217294A1 true WO2022217294A1 (en) | 2022-10-13 |
Family
ID=81389139
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2022/071653 WO2022217294A1 (en) | 2021-04-09 | 2022-04-09 | Personalized biometric anti-spoofing protection using machine learning and enrollment data |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2022217294A1 (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117973527A (en) * | 2024-04-02 | 2024-05-03 | 云南师范大学 | Knowledge tracking method based on GRU capturing problem context characteristics |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20200175290A1 (en) * | 2017-06-30 | 2020-06-04 | Norwegian University Of Science And Technology | Detection of manipulated images |
US20200394289A1 (en) * | 2019-06-14 | 2020-12-17 | Microsoft Technology Licensing, Llc | Biometric verification framework that utilizes a convolutional neural network for feature matching |
US20210009080A1 (en) * | 2019-02-28 | 2021-01-14 | Shanghai Sensetime Lingang Intelligent Technology Co., Ltd. | Vehicle door unlocking method, electronic device and storage medium |
-
2022
- 2022-04-09 WO PCT/US2022/071653 patent/WO2022217294A1/en active Application Filing
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20200175290A1 (en) * | 2017-06-30 | 2020-06-04 | Norwegian University Of Science And Technology | Detection of manipulated images |
US20210009080A1 (en) * | 2019-02-28 | 2021-01-14 | Shanghai Sensetime Lingang Intelligent Technology Co., Ltd. | Vehicle door unlocking method, electronic device and storage medium |
US20200394289A1 (en) * | 2019-06-14 | 2020-12-17 | Microsoft Technology Licensing, Llc | Biometric verification framework that utilizes a convolutional neural network for feature matching |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117973527A (en) * | 2024-04-02 | 2024-05-03 | 云南师范大学 | Knowledge tracking method based on GRU capturing problem context characteristics |
CN117973527B (en) * | 2024-04-02 | 2024-06-07 | 云南师范大学 | Knowledge tracking method based on GRU capturing problem context characteristics |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20220327189A1 (en) | Personalized biometric anti-spoofing protection using machine learning and enrollment data | |
US11657525B2 (en) | Extracting information from images | |
US11941918B2 (en) | Extracting information from images | |
US20200082062A1 (en) | User adaptation for biometric authentication | |
Arora et al. | A robust framework for spoofing detection in faces using deep learning | |
El Khiyari et al. | Age invariant face recognition using convolutional neural networks and set distances | |
US20220012511A1 (en) | Systems and methods for enrollment in a multispectral stereo facial recognition system | |
WO2013181695A1 (en) | Biometric verification | |
Lakshmi et al. | Off-line signature verification using Neural Networks | |
Appati et al. | Implementation of a Transform‐Minutiae Fusion‐Based Model for Fingerprint Recognition | |
Choras | Multimodal biometrics for person authentication | |
Saied et al. | A novel approach for improving dynamic biometric authentication and verification of human using eye blinking movement | |
Rajalakshmi et al. | A multimodal architecture using Adapt‐HKFCT segmentation and feature‐based chaos integrated deep neural networks (Chaos‐DNN‐SPOA) for contactless biometricpalm vein recognition system | |
Ramya et al. | An efficient Minkowski distance-based matching with Merkle hash tree authentication for biometric recognition in cloud computing | |
Kumar et al. | Face and Iris‐Based Secured Authorization Model Using CNN | |
Zhang et al. | 2D fake fingerprint detection based on improved CNN and local descriptors for smart phone | |
Kavitha et al. | Fuzzy local ternary pattern and skin texture properties based countermeasure against face spoofing in biometric systems | |
WO2022217294A1 (en) | Personalized biometric anti-spoofing protection using machine learning and enrollment data | |
Kuznetsov et al. | Biometric authentication using convolutional neural networks | |
Berriche | Comparative Study of Fingerprint‐Based Gender Identification | |
Gupta et al. | Biometric iris identifier recognition with privacy preserving phenomenon: A federated learning approach | |
Shirke et al. | Optimization driven deep belief network using chronological monarch butterfly optimization for iris recognition at-a-distance | |
Bokade et al. | An ArmurMimus multimodal biometric system for Khosher authentication | |
Khade et al. | Machine learning-based iris liveness identification using fragmental energy of cosine transformed iris images | |
Ramya et al. | A comparative analysis of similarity distance measure functions for biocryptic authentication in cloud databases |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 22719496 Country of ref document: EP Kind code of ref document: A1 |
|
REG | Reference to national code |
Ref country code: BR Ref legal event code: B01A Ref document number: 112023019936 Country of ref document: BR |
|
WWE | Wipo information: entry into national phase |
Ref document number: 2022719496 Country of ref document: EP |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
ENP | Entry into the national phase |
Ref document number: 2022719496 Country of ref document: EP Effective date: 20231109 |
|
ENP | Entry into the national phase |
Ref document number: 112023019936 Country of ref document: BR Kind code of ref document: A2 Effective date: 20230927 |