Number Plate Recognition
Automatic Number Plate Recognition (ANPR) is used in a variety of applications, such as parking and ticketless fee automation. In these applications, cameras are usually mounted at entry and exit points to capture the number plates of vehicles as they pass; perhaps the best known, and most loathed, use of ANPR is the fixed speed camera. In all of these cases the camera sits in a fixed location, with a controlled setup to capture the picture, which is then processed to detect the number plate.
But what if we needed ANPR on handheld devices, rather than in a fixed and controlled setup where image parameters are always the same?
In this example, I will utilise Microsoft's Custom Vision service to detect the number plate object in a picture, then pass the detected plate to the Read API in Azure Cognitive Services to extract the plate number.
The Approach
In this example, a custom object detection model is used first, followed by an OCR step to recognise the characters on the number plate with a high level of certainty. I am assuming that images could come in varying quality and could be captured by handheld devices (as opposed to a fixed camera with controlled settings). A typical pipeline involves steps to:
- Load and pre-process the image
- Detect the plate in the image
- Segment the plate region
- Recognise the characters
A model is trained and published using customvision.ai to detect the number plate object in the image. OpenCV is then used for the image processing steps, and the Read API in Azure Cognitive Services is used for its OCR functionality.
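As a rough outline, the steps above hang together as sketched below; the body here is only a skeleton, and the real implementation of each step is shown in the sections that follow.

import cv2

def recognise_plate(image_path):
    # 1. Load and pre-process the image with OpenCV
    img = cv2.imread(image_path)
    # 2. Detect the plate: POST the image bytes to the published
    #    Custom Vision prediction API and keep the best bounding box
    # 3. Segment: crop the image to that bounding box
    # 4. Recognise characters: send the crop to the Read API and
    #    poll for the extracted text
    ...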
Object Detection using CustomVision.ai
To get started, head over to https://customvision.ai, sign in and create a new project.
Custom vision allows you to create two types of models:
- Image Classification: to tag the entire image
- Object Detection: to find and tag content within an image
Create an object detection model for this example.
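If you prefer to script this step rather than click through the portal, the Custom Vision training SDK (the azure-cognitiveservices-vision-customvision package) can create the project programmatically. A minimal sketch, assuming endpoint and training_key come from your Custom Vision resource in the Azure portal:

from msrest.authentication import ApiKeyCredentials
from azure.cognitiveservices.vision.customvision.training import CustomVisionTrainingClient

# endpoint and training_key are taken from your Custom Vision resource (assumed here)
credentials = ApiKeyCredentials(in_headers={"Training-key": training_key})
trainer = CustomVisionTrainingClient(endpoint, credentials)

# pick an object detection domain and create the project
domain = next(d for d in trainer.get_domains()
              if d.type == "ObjectDetection" and d.name == "General")
project = trainer.create_project("anpr-plates", domain_id=domain.id)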
Upload and label training images
Once the project is created, upload sample images. In this case, I uploaded 20 pictures of Sydney taxis that I got using image search on the internet.
After loading the training images, you will need to go through them and label the number plates.
The screenshot above shows a training image with the number plate manually labelled.
Train model
Start the training process once all training images are labelled. The first time training is kicked off, it could take a while to finish. For this example, I used quick training.
Once training is complete, the tool presents the performance metrics of the trained model. It is essential to review the precision and recall of the trained model to determine whether its quality meets the application's requirements. Additional training may be required if these results are not satisfactory.
Lastly, the Quick Test function lets you validate the model against a holdout data set. It is also useful for understanding the type of response to expect from the prediction API.
Publish trained model
Once a trained model is ready and is providing adequate results, the model will need to be published as a REST API to be consumed by downstream applications.
Publishing generates the URL and authentication key for the model.
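For an object detection project, the published prediction URL for image uploads generally takes a shape like the following (the region, project ID, and iteration name are specific to your resource; this example is illustrative, not taken from the notebook):

https://<region>.api.cognitive.microsoft.com/customvision/v3.0/Prediction/<project-id>/detect/iterations/<iteration-name>/image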
The code
A Custom Vision model was trained to detect a number plate in an image, and the prediction API for the trained model was published for use in this example. This Jupyter notebook (link) on GitHub illustrates how to put together the object detection and OCR steps to detect and recognise the text on a number plate.
The following sections explain the different parts of the code.
APIs
APIs for Custom Vision and Computer Vision are used in this example. The secrets.ini file in the code example should include the access keys and URLs for the published customvision.ai prediction API (from the previous section) and for the Azure Cognitive Services instance.
The secrets file is formatted as follows:
[custom_vision]
key = <your published prediction api key>
imgurl = <URL for custom vision api, should end with /image>
[computer_vision]
key = <your computer vision api key>
url = https://australiaeast.api.cognitive.microsoft.com/
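Assuming the notebook reads these values with Python's built-in configparser, loading the keys would look something like this (file name and section names as above):

import configparser

config = configparser.ConfigParser()
config.read('secrets.ini')

# published Custom Vision prediction endpoint and key
custom_vision_key = config['custom_vision']['key']
custom_vision_imgurl = config['custom_vision']['imgurl']

# Computer Vision base URL and key; text_recognition_url is then formed
# from this base URL plus the Read API path for the service version in use
subscription_key = config['computer_vision']['key']
vision_base_url = config['computer_vision']['url']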
The requests library in Python is used to make the calls to these APIs in the code example.
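For reference, the notebook's imports would look roughly like this (the libraries mentioned above, plus pandas, matplotlib, and time, which the snippets below rely on):

import time

import requests
import cv2
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt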
Custom Vision API Call
To make the call to the Custom Vision API, the code first loads the image as a byte array; using the opencv library, the bytes are also decoded into a cv2 image for display and visualisation later on.
data = open('anpr_samples/bike.jpg', 'rb').read()

# decode the image file as a cv2 image,
# useful for later to display results
img = cv2.imdecode(np.array(bytearray(data), dtype='uint8'),
                   cv2.IMREAD_COLOR)
The REST request must pass in the authentication key to the published custom vision model and the data array for the loaded image.
custom_vision_headers = {
    'Content-Type': 'application/octet-stream',
    'Prediction-Key': custom_vision_key}

custom_vision_resp = requests.post(url=custom_vision_imgurl,
                                   data=data,
                                   headers=custom_vision_headers).json()
In the following section, the response from the REST API call is examined to find the detected object. The code below is simplistic in its logic, as it only looks at the returned hit with the highest probability; more elaborate logic may be required to examine all hits in the returned result, not just the most probable one (a sketch of this follows the snippet below).
# inspect the top result, based on probability
hit = (pd.DataFrame(custom_vision_resp['predictions'])
         .sort_values(by='probability', ascending=False)
         .head(1)
         .to_dict())
print(hit)
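As a sketch of that fuller handling, one could keep every prediction above a confidence cut-off instead of only the top hit (the 0.5 threshold here is an arbitrary example, not a value from the notebook):

# keep all 'plate' detections above an arbitrary probability threshold
predictions = pd.DataFrame(custom_vision_resp['predictions'])
plates = predictions[(predictions['tagName'] == 'plate') &
                     (predictions['probability'] > 0.5)]
print(plates[['probability', 'boundingBox']])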
Here we can see that a plate was detected with a probability of 0.92; the boundingBox element of the response gives the location of the detected object relative to the size of the image.
{
    'probability': {30: 0.926827848},
    'tagId': {30: '1752966c-d1e1-46f5-924b-505d90a16981'},
    'tagName': {30: 'plate'},
    'boundingBox': {30: {
        'left': 0.1388305,
        'top': 0.486726224,
        'width': 0.09517826,
        'height': 0.079084456
    }}
}
The bounding box dimensions can be used to draw a box around the detected plate.
# extract the bounding box for the detected number plate
boundingbox = list(hit['boundingBox'].values())[0]
l, t, w, h = (boundingbox['left'],
              boundingbox['top'],
              boundingbox['width'],
              boundingbox['height'])

# the bounding box coordinates and dimensions are normalised,
# so scale them using the image dimensions
polylines1 = np.multiply([[l, t], [l + w, t], [l + w, t + h], [l, t + h]],
                         [img.shape[1], img.shape[0]])

# draw polylines based on the bounding box results
img2 = cv2.polylines(img,
                     np.int32([polylines1]),
                     1, (255, 255, 0), 4,
                     lineType=cv2.LINE_AA)
Now that the number plate is detected and located, a cropped image of the plate is saved to a new image object. The cropped plate image is then used for character recognition.
# crop the image to the bounding box of the rego plate
crop_x = polylines1[:, 0].astype('uint16')
crop_y = polylines1[:, 1].astype('uint16')
img_crop = img2[np.min(crop_y):np.max(crop_y),
                np.min(crop_x):np.max(crop_x)]

# display the detected rego plate region
plt.imshow(cv2.cvtColor(img2, cv2.COLOR_BGR2RGB))
Text Recognition
In this section, the cropped image for the detected rego plate is sent to the text extraction API.
Extracting text requires two API calls: One call to submit the image for processing, the other to retrieve the text found in the image.
First, the cropped image is converted to a byte array before it is sent to the text recognition API.
crop_bytes = bytes(cv2.imencode('.jpg', img_crop)[1])

# make a call to the text_recognition_url
response = requests.post(
    url=text_recognition_url,
    data=crop_bytes,
    headers={
        'Ocp-Apim-Subscription-Key': subscription_key,
        'Content-Type': 'application/octet-stream'})
The response of the text recognition API call holds the callback URL to retrieve the extracted text. This URL is saved, and then the API is polled for results.
# Holds the URI used to retrieve the recognized text.
response.raise_for_status()
operation_url = response.headers["Operation-Location"]

# The recognized text isn't immediately available, so poll to wait for completion.
analysis = {}
poll = True
while poll:
    response_final = requests.get(
        operation_url,
        headers={'Ocp-Apim-Subscription-Key': subscription_key})
    analysis = response_final.json()
    print(analysis)
    time.sleep(1)
    if "recognitionResults" in analysis:
        poll = False
    if "status" in analysis and analysis['status'] == 'Failed':
        poll = False
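The loop above polls until a result or a failure status comes back; in practice it is also worth capping the number of attempts. A minimal variant (max_polls is an arbitrary limit, not part of the original notebook):

# same polling logic, but give up after a fixed number of attempts
max_polls = 10
for _ in range(max_polls):
    analysis = requests.get(
        operation_url,
        headers={'Ocp-Apim-Subscription-Key': subscription_key}).json()
    if 'recognitionResults' in analysis or analysis.get('status') == 'Failed':
        break
    time.sleep(1)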
A successful recognition result object includes the detected lines and text/words within these lines.
{
    'status': 'Succeeded',
    'recognitionResults': [
        {
            . . .
            'lines': [
                {
                    'boundingBox': [******],
                    'text': 'EDY.60',
                    . . .
                }
            ]
        }
    ]
}
All that is left is to extract the text from the response object.
for i, l in enumerate(analysis['recognitionResults'][0]['lines']):
    print(i, ': text found: ', [w['text'] for w in l['words']])
0 : text found: ['EDY.60']
Summary
The approach for number plate recognition illustrated in this post is simple yet very effective. No particular knowledge of neural network techniques is required to build a solution with this approach, as it relies on the use of cognitive services and APIs.
A few options can be considered to improve the results further:
- Supplying additional training images in different orientations to enhance the output of object detection
- Post-processing detected objects and examining all returned results of the object detection step
- Training a custom OCR model to improve the text recognition outcomes