Number Plate Recognition


Automatic Number Plate Recognition (ANPR) is used in a variety of applications, such as parking and ticketless fee automation. In such applications, cameras are usually mounted at the entry points to capture the number plates of entering and exiting vehicles. Perhaps the most known, and most loathed, use of ANPR is fixed speed cameras. In all these applications, the camera is in a fixed location, with a controlled setup to capture the picture and then process it to detect the number plate.

But what if we needed ANPR with handheld devices, and not in a fixed and controlled setup where image parameters are always the same?

In this example, I will utilise Microsoft’s custom vision service to detect number plate objects in a picture, then pass the number plate object to the Read API service in Azure Cognitive Services to extract the plate number.

The Approach

In this example, a custom object detection model is used first, followed by an OCR step to recognise numbers/digits in a number plate with a high level of certainty. I am assuming that images could come in different qualities, and could be captured by handheld devices (as opposed to a fixed camera with controlled settings). A typical pipeline would involve steps to:

  1. Load and pre-process the image
  2. Detect plate in the image
  3. Image segmentation
  4. Character recognition

A model is trained and published using customvision.ai to detect the number plate object in the image. OpenCV is then used for image processing steps and Read API in Azure Cognitive Services is used for its OCR functionality.

Object Detection using CustomVision.ai

To get started, head over to https://customvision.ai, sign in and create a new project.

(Screenshot: creating a new project in customvision.ai)

Custom vision allows you to create two types of models:

  • Image Classification: to tag the entire image
  • Object Detection: to find and tag content within an image

Create an object detection model for this example.

Upload and label training images

Once the project is created, upload sample images. In this case, I uploaded 20 pictures of Sydney taxis that I got using image search on the internet.

After loading the training images, you will need to go through them and label the number plates.

(Screenshot: labelling a number plate in a training image)

The screenshot above shows a training image, with a manually labelled number plate in the image.

Train model

Start the training process once all training images are labelled. The first time training is kicked off, it could take a while to finish. For this example, I used quick training.

(Screenshot: training the model)

Once training is complete, the tool will present the performance metrics of the trained model. It is essential to observe the precision and recall of the trained model to determine the relevance and quality of the model to the application requirements. Additional training may be required if these results are not satisfactory.
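
As a reminder of the standard definitions: precision is the proportion of detected plates that really are plates, TP / (TP + FP), while recall is the proportion of actual plates that the model manages to detect, TP / (TP + FN). A model with high precision but low recall misses plates, while high recall with low precision produces false detections.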

(Screenshot: performance metrics after training completes)

Lastly, using the Quick Test function, you can validate the model against a holdout data set. This tool is also useful for understanding the type of response to expect from the prediction API.

(Screenshot: quick test of the trained model)

Publish trained model

Once a trained model is ready and is providing adequate results, the model will need to be published as a REST API to be consumed by downstream applications.

(Screenshot: publishing the trained model)

Publishing generates the URL and authentication key for the model.

(Screenshot: details of the published prediction API)

The code

A custom vision model was trained to detect a number plate in an image, and the prediction API for the trained model was published to be used in this example. This Jupyter notebook (link) on GitHub illustrates how to put together object detection and then OCR to detect and recognise the text in a number plate.

The following sections explain the different parts of the code.

APIs

APIs for custom vision and computer vision are used in this example. The secrets.ini file in the code example should include the access keys and URLs for the published customvision.ai prediction API (from the previous section) and for the Azure Cognitive Services instance.

The secrets file is formatted as follows:

[custom_vision]
key = <your published prediction api key>
imgurl = <URL for custom vision api, should end with /image>

[computer_vision]
key = <your computer vision api key>
url = https://australiaeast.api.cognitive.microsoft.com/ 
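
As a minimal sketch, these settings could be loaded with Python's configparser. The variable names below are chosen to match the ones used in the rest of the code, and the Read API path is an assumption that may need adjusting to the Cognitive Services version in use:

import configparser

# read the keys and endpoints from the secrets file
config = configparser.ConfigParser()
config.read('secrets.ini')

custom_vision_key = config['custom_vision']['key']
custom_vision_imgurl = config['custom_vision']['imgurl']

subscription_key = config['computer_vision']['key']
# assumed Read API path; adjust to the version of the service being used
text_recognition_url = config['computer_vision']['url'] + 'vision/v2.0/read/core/asyncBatchAnalyze'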

The requests library in Python is used to make calls to the APIs in the code example.

Custom Vision API Call

To make the call to the custom vision API, the code first loads the image as a byte array; the OpenCV library is then used to decode the bytes into a cv2 image for displaying and visualising results later on.

import cv2
import numpy as np

data = open('anpr_samples/bike.jpg', 'rb').read()
 # decode the image file as a cv2 image, 
 # useful for later to display results
img = cv2.imdecode(np.array(bytearray(data), dtype='uint8'), 
    cv2.IMREAD_COLOR)

The REST request must pass in the authentication key to the published custom vision model and the data array for the loaded image.

import requests

custom_vision_headers = {
    'Content-Type': 'application/octet-stream', 
    'Prediction-Key': custom_vision_key}
custom_vision_resp = requests.post(url=custom_vision_imgurl, 
    data=data, 
    headers=custom_vision_headers).json()

In the following section, the response from the REST API call is examined to find the detected object. The code below is simplistic in its logic, as it only looks at the returned hit with the highest probability; more complex logic may be required to examine more hits in the returned result, and not just the highest-probability hit.

import pandas as pd

 # inspect the top result, based on probability 
hit = (pd.DataFrame(custom_vision_resp['predictions'])
    .sort_values(by='probability', ascending=False)
    .head(1)
    .to_dict())
print(hit)

Here we can see that a plate was detected with a probability of 0.92; the boundingBox element of the response represents the location of the object relative to the size of the image.

    {
        'probability': {30: 0.926827848}, 
        'tagId': {30: '1752966c-d1e1-46f5-924b-505d90a16981'}, 
        'tagName': {30: 'plate'}, 
        'boundingBox': {30: {
            'left': 0.1388305, 
            'top': 0.486726224, 
            'width': 0.09517826, 
            'height': 0.079084456
            }
        }
    }

The bounding box dimensions can be used to draw a box around the detected plate.

 # extract the bounding box for the detected number plate 
boundingbox = list(hit['boundingBox'].values())[0]
l,t,w,h = (boundingbox['left'], 
    boundingbox['top'], 
    boundingbox['width'], 
    boundingbox['height'])

 # scale the bounding box coordinates and dimensions 
 # using the image dimensions 
polylines1 = np.multiply([[l,t],[l+w,t],[l+w,t+h],[l,t+h]], 
    [img.shape[1],img.shape[0]])

 # draw polylines based on bounding box results
img2 = cv2.polylines(img, 
    np.int32([polylines1]), 
    1, (255,255,0), 4, 
    lineType=cv2.LINE_AA )


Now that the number plate is detected and located, a cropped image of the plate is saved to a new image object. The cropped plate image is then used for character recognition.

import matplotlib.pyplot as plt

 # crop the image to the bounding box of the rego plate

crop_x = polylines1[:,0].astype('uint16')
crop_y = polylines1[:,1].astype('uint16')

img_crop = img2[np.min(crop_y):np.max(crop_y), 
    np.min(crop_x):np.max(crop_x)]

 # display the image with the detected rego plate highlighted
plt.imshow(cv2.cvtColor(img2, cv2.COLOR_BGR2RGB))


Text Recognition

In this section, the cropped image for the detected rego plate is sent to the text extraction API.

Extracting text requires two API calls: One call to submit the image for processing, the other to retrieve the text found in the image.

First, the cropped image is converted to a byte array before it is sent to the text recognition API.

crop_bytes = bytes(cv2.imencode('.jpg', img_crop)[1])
 # make a call to the text_recognition_url
response = requests.post(
    url=text_recognition_url, 
    data=crop_bytes, 
    headers={
        'Ocp-Apim-Subscription-Key': subscription_key, 
        'Content-Type': 'application/octet-stream'})

The response of the text recognition API call holds the callback URL to retrieve the extracted text. This URL is saved, and then the API is polled for results.

import time

 # Holds the URI used to retrieve the recognized text.
response.raise_for_status()
operation_url = response.headers["Operation-Location"]

 # The recognized text isn't immediately available, so poll to wait for completion.
analysis = {}
poll = True
while (poll):
    response_final = requests.get(
        response.headers["Operation-Location"], 
        headers={'Ocp-Apim-Subscription-Key': subscription_key})
    analysis = response_final.json()
    print(analysis)
    time.sleep(1)
    if ("recognitionResults" in analysis):
        poll = False
    if ("status" in analysis and analysis['status'] == 'Failed'):
        poll = False

A successful recognition result object includes the detected lines and text/words within these lines.

    {
        'status': 'Succeeded', 
        'recognitionResults': [
            {
                . . . 
                'lines': [
                    {
                        'boundingBox': [******], 
                        'text': 'EDY.60', 
                        . . .
                    }]
            }]
    }

All that is left is to extract the text from the response object.

for i,l in enumerate(analysis['recognitionResults'][0]['lines']): 
    print(i, ': text found: ', [w['text'] for w in l['words']])
0 : text found:  ['EDY.60']


Summary

The approach for number plate recognition illustrated in this post is simple yet very effective. No particular knowledge of neural network techniques is required to build a solution with this approach, as it relies on the use of cognitive services and APIs.

A few options can be considered to improve the results further:

  • Supplying additional training images in different orientations to enhance the output of object detection
  • Post-processing detected objects and examining all returned results of the object detection step (see the sketch after this list)
  • Training a custom OCR model to improve the text recognition outcomes
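
As an illustration of the second point, here is a minimal sketch that examines every returned detection rather than only the top hit. The 0.5 probability threshold is an arbitrary value chosen for illustration, not something from the original example:

# keep all detections tagged as a plate above a probability threshold
# (0.5 is an arbitrary cut-off used for illustration)
candidates = [p for p in custom_vision_resp['predictions']
    if p['tagName'] == 'plate' and p['probability'] > 0.5]

# process the most confident candidate plates first
for p in sorted(candidates, key=lambda c: c['probability'], reverse=True):
    box = p['boundingBox']
    print(p['probability'], box['left'], box['top'], box['width'], box['height'])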
