I'll show you the principle of face recognition

Date:2019-12-04


More and more mobile phones now support face recognition. How does face recognition work, and what is the 3D structured light that some phones use? Since this is face recognition, the phone first needs hardware that can capture information about our face.

One article to help you understand the principle of face recognition and the 3D structured light used by some mobile phones

At present, mobile phones obtain facial information in two ways. One is the approach used by the great many Android phones: 2D face recognition performed directly through the front camera. The other is 3D face recognition, implemented with the components housed in the notch of phones such as Apple's. Although one is 2D and the other is 3D, the underlying principle is basically the same, so let's start there. When we use face recognition for the first time, we have to enroll our face, just as we enroll a fingerprint for fingerprint unlocking.


After the camera captures our face, the image has to be processed first, because the conditions under which we enroll vary widely: some images are poorly lit, and some are noisy. Processing the image makes it easier for the phone to recognize the face. Next, the phone extracts facial features, such as the distances between facial parts (eyes, nose, mouth) and their geometry. The extracted features are then saved. When we unlock the phone, it repeats the same steps.
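The feature-extraction step can be sketched roughly as follows. This is only an illustration, not how any particular phone does it: the landmark names and coordinates are made up, and the sketch assumes some separate detector has already located those points in the image. It turns the points into pairwise distances and divides by the eye-to-eye distance so the numbers do not depend on how far the face is from the camera.

```python
import math
from itertools import combinations

def geometric_features(landmarks):
    """Turn named 2D landmark points into scale-normalized pairwise distances."""
    names = sorted(landmarks)
    # Use the eye-to-eye distance as the unit of length, so the features
    # stay the same when the face is nearer to or farther from the camera.
    scale = math.dist(landmarks["left_eye"], landmarks["right_eye"])
    return {
        (a, b): math.dist(landmarks[a], landmarks[b]) / scale
        for a, b in combinations(names, 2)
    }

# Hypothetical landmark positions in pixels (illustrative values only).
face = {
    "left_eye": (30.0, 40.0),
    "right_eye": (70.0, 40.0),
    "nose": (50.0, 60.0),
    "mouth": (50.0, 80.0),
}
features = geometric_features(face)
for pair, value in features.items():
    print(pair, round(value, 3))
```

Real systems use far more points and learned features, but the idea of saving a compact set of numbers instead of the raw photo is the same.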


The phone then compares the newly extracted features with the features saved at enrollment; as long as most of them match, the phone unlocks. This is the general principle of face recognition. Extracting facial information is like creating a password: the more information the password contains, the more secure it is. Current Android phones, including the Xiaomi Mi 8 with its infrared camera, photograph the face directly with a front camera, and every such photo is a flat image; in other words, this is 2D face recognition. Because the camera captures only a flat image, holding a photo up to the camera can fool it: whether the camera is pointed at a real three-dimensional face or at a picture, the result is a flat image. 2D face recognition is therefore like a six-digit, numbers-only password.
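The "most of the features must match" rule can be sketched like this. The feature names, the tolerance of 0.05, and the 80% threshold are all assumptions chosen for illustration; the article does not say what values real phones use.

```python
def match_fraction(enrolled, probe, tolerance=0.05):
    """Fraction of enrolled features that the probe reproduces within tolerance."""
    hits = sum(
        1
        for key, value in enrolled.items()
        if key in probe and abs(probe[key] - value) <= tolerance
    )
    return hits / len(enrolled)

def can_unlock(enrolled, probe, tolerance=0.05, required=0.8):
    """Unlock when at least `required` of the enrolled features match."""
    return match_fraction(enrolled, probe, tolerance) >= required

# Hypothetical normalized distances saved at enrollment.
enrolled = {"eye_to_eye": 1.00, "eye_to_nose": 0.71, "nose_to_mouth": 0.50,
            "eye_to_mouth": 1.12, "mouth_width": 0.65}
# Same person on another day: one feature is off (say, a wide smile).
same_person = {"eye_to_eye": 1.00, "eye_to_nose": 0.73, "nose_to_mouth": 0.49,
               "eye_to_mouth": 1.10, "mouth_width": 0.80}
# A different face: most features differ.
other_person = {"eye_to_eye": 1.00, "eye_to_nose": 0.85, "nose_to_mouth": 0.40,
                "eye_to_mouth": 1.30, "mouth_width": 0.66}

print(can_unlock(enrolled, same_person))
print(can_unlock(enrolled, other_person))
```

Note that a printed photo of the enrolled face would produce the same 2D features, which is exactly the weakness described above.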


To improve the security of 2D face recognition, various algorithms are used, such as border detection and inverse detection, which to some extent prevent photos and videos from fooling it. But a six-digit password is still a six-digit password. To fundamentally improve security, the password needs more digits, upper- and lower-case letters, even special symbols; in other words, more feature information. So besides the distances between facial parts and their geometry, we also need the depth information of the face. This is what we call 3D face recognition. So far there are three technologies for obtaining facial depth. The first is ToF (time of flight): the sensor emits infrared light, which reflects off the object's surface back to the sensor, and the sensor derives depth from the phase difference between the emitted and the reflected light.
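As a rough sketch of the ToF calculation: for a continuous-wave sensor whose light is modulated at a frequency f, a phase shift Δφ corresponds to a distance of c·Δφ / (4π·f). The 100 MHz modulation frequency below is an assumed value for illustration, not a figure from any specific phone.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # metres per second

def tof_depth(phase_shift_rad, modulation_hz):
    """Distance implied by the phase shift between emitted and reflected light."""
    # The light travels to the object and back, hence 4*pi rather than 2*pi.
    return SPEED_OF_LIGHT * phase_shift_rad / (4 * math.pi * modulation_hz)

def unambiguous_range(modulation_hz):
    """Largest distance before the phase wraps around past 2*pi."""
    return SPEED_OF_LIGHT / (2 * modulation_hz)

modulation_hz = 100e6  # assumed modulation frequency
depth_m = tof_depth(math.pi / 2, modulation_hz)
print(f"depth: {depth_m:.3f} m")
print(f"max unambiguous range: {unambiguous_range(modulation_hz):.3f} m")
```

Because the phase repeats every 2π, a single modulation frequency can only measure distances up to the unambiguous range, which is comfortably more than arm's length at this frequency.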


The second is binocular ranging, which works like our two eyes. Two cameras shooting the same scene produce two different flat images; the same features are marked in both images, and depth is then calculated by triangulation. The difficulty lies in accurately marking the feature points that the two images have in common. What does this mean? Suppose we are out shopping with a friend. The friend is the left camera and we are the right camera, and what we both see is the view of the street. The friend says, "Look at the girl in the red dress ahead." Most of us know the feeling: even though the friend said she was wearing red, it is still hard to spot her right away. This is the difficulty of binocular ranging. How can it be solved? We can have the friend point a laser pointer at the girl, and now we find the target immediately. This is the third technology, which we call 3D structured light.
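Once the same point has been found in both images, the triangulation itself is simple. A minimal sketch, assuming two identical, perfectly aligned cameras: depth equals focal length times the distance between the cameras, divided by how far the point shifts between the two images (the disparity). The focal length, baseline, and pixel positions are invented for the example.

```python
def stereo_depth(x_left_px, x_right_px, focal_px, baseline_m):
    """Depth of a point seen at column x_left_px in the left image
    and at column x_right_px in the right image."""
    disparity = x_left_px - x_right_px
    if disparity <= 0:
        raise ValueError("point must appear further left in the right image")
    # Similar triangles: depth / baseline = focal length / disparity.
    return focal_px * baseline_m / disparity

# Assumed setup: 800 px focal length, cameras 6 cm apart.
near = stereo_depth(420, 380, focal_px=800, baseline_m=0.06)
far = stereo_depth(420, 410, focal_px=800, baseline_m=0.06)
print(f"near point: {near:.2f} m, far point: {far:.2f} m")
```

The nearer a point is, the more it shifts between the two views. The hard part, as the street example shows, is not this formula but deciding which pixel in the right image corresponds to which pixel in the left one.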


This is why the introductions on Apple's and Xiaomi's official websites show a dot projector and an infrared camera in the notch. The dot projector projects light dots onto the face, and the infrared camera finds them. After that, the steps are the same as in binocular ranging: triangulation is used to calculate the depth of each dot. This is the principle of 3D structured light. However, where Apple's and Xiaomi's websites visualize the invisible dots projected onto the face, the two look different: Apple's are small dots, while Xiaomi's resemble a QR code. What is the difference? Apple uses speckle structured light, whose speckle is somewhat random, so security is better but the computation is heavier. Xiaomi uses regular coded structured light, which needs less computation but is slightly less secure than the former.
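Why projected dots play the role of the laser pointer can be shown with a toy example. The sketch below is a heavy simplification: it uses a single row of dots written as a string of 0s and 1s, and both rows are made up. It compares a surface with a repetitive texture against a row lit by an irregular dot pattern, and asks at how many positions a small five-dot window could be found.

```python
def locate(row, window):
    """Return every offset at which `window` occurs in the row of dots."""
    n = len(window)
    return [i for i in range(len(row) - n + 1) if row[i:i + n] == window]

# A surface with a repetitive texture: every neighbourhood looks alike.
repetitive_row = "10" * 8
# A row lit by an irregular projected dot pattern (1 = dot, 0 = no dot).
projected_row = "1101000101100111"

plain_window = repetitive_row[4:9]   # "10101"
dot_window = projected_row[4:9]      # "00010"

ambiguous_offsets = locate(repetitive_row, plain_window)
unique_offsets = locate(projected_row, dot_window)

print("repetitive texture candidates:", ambiguous_offsets)
print("projected pattern candidates:", unique_offsets)
```

On the repetitive row the window fits in several places, so the system cannot tell which one is the true match; on the projected pattern it fits in exactly one place, and triangulation can proceed from there.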