Getting Started with Azure Form Recognizer API
- Before writing any code, create an Azure Form Recognizer resource and copy its endpoint and API key from the resource's "Keys and Endpoint" page in the Azure portal.
- Install the Azure.AI.FormRecognizer NuGet package in your .NET project; it provides the client classes used to call the API.
- You'll typically work with the FormRecognizerClient class, which provides methods to analyze documents.
    dotnet add package Azure.AI.FormRecognizer
Sample Configuration and Initialization
- Initialize the FormRecognizerClient using the endpoint and API key obtained from your Azure account.
- Create the client once, in the part of your application that handles dependency injection or service initialization, and reuse it. Azure SDK clients are thread-safe, so a single instance can serve the whole application.
    using Azure;
    using Azure.AI.FormRecognizer;
    using Azure.AI.FormRecognizer.Models;
    using Azure.AI.FormRecognizer.Training;
    using System;
    using System.IO;
    using System.Threading.Tasks;

    namespace FormRecognizerExample
    {
        public class FormRecognizerService
        {
            // Analyzes documents (layout content and custom models).
            private readonly FormRecognizerClient _formRecognizerClient;
            // Trains and manages custom models.
            private readonly FormTrainingClient _formTrainingClient;

            public FormRecognizerService(string endpoint, string apiKey)
            {
                var credential = new AzureKeyCredential(apiKey);
                _formRecognizerClient = new FormRecognizerClient(new Uri(endpoint), credential);
                _formTrainingClient = new FormTrainingClient(new Uri(endpoint), credential);
            }
        }
    }
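One way to keep the key out of source code is to read both values from environment variables at startup. The sketch below assumes two hypothetical variable names, FORM_RECOGNIZER_ENDPOINT and FORM_RECOGNIZER_KEY, which are not defined by Azure; substitute whatever your configuration system provides.

```csharp
using System;
using Azure;
using Azure.AI.FormRecognizer;

public static class FormRecognizerClientFactory
{
    // Hypothetical variable names, not defined by Azure; rename to match your deployment.
    private const string EndpointVariable = "FORM_RECOGNIZER_ENDPOINT";
    private const string KeyVariable = "FORM_RECOGNIZER_KEY";

    public static FormRecognizerClient CreateFromEnvironment()
    {
        var endpoint = Require(EndpointVariable);
        var apiKey = Require(KeyVariable);

        // Fail early on a malformed endpoint instead of on the first request.
        if (!Uri.TryCreate(endpoint, UriKind.Absolute, out var endpointUri))
        {
            throw new InvalidOperationException($"{EndpointVariable} is not an absolute URI.");
        }

        return new FormRecognizerClient(endpointUri, new AzureKeyCredential(apiKey));
    }

    private static string Require(string name)
    {
        var value = Environment.GetEnvironmentVariable(name);
        if (string.IsNullOrWhiteSpace(value))
        {
            throw new InvalidOperationException($"Environment variable {name} is not set.");
        }
        return value;
    }
}
```

In an application that uses dependency injection, the result of CreateFromEnvironment() could be registered once as a singleton so every consumer shares the same client.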
Analyzing Forms with Form Recognizer
- Use StartRecognizeContentAsync to extract the text lines and tables that make up a document's layout, or StartRecognizeCustomFormsAsync to extract labeled fields with a custom model you have trained.
- Custom models are addressed by model ID, so keep the ID that was returned when you trained the model.
    public async Task AnalyzeFormAsync(string filePath)
    {
        // Open the document read-only; the stream is disposed when the method exits.
        using var stream = new FileStream(filePath, FileMode.Open, FileAccess.Read);

        // Start the long-running layout analysis and wait for the service to finish.
        var operation = await _formRecognizerClient.StartRecognizeContentAsync(stream);
        var result = await operation.WaitForCompletionAsync();

        foreach (var page in result.Value)
        {
            Console.WriteLine($"Page number: {page.PageNumber}");
            foreach (var table in page.Tables)
            {
                Console.WriteLine("Table data:");
                foreach (var cell in table.Cells)
                {
                    // FormTableCell exposes its text through the Text property.
                    Console.WriteLine($"Cell text: '{cell.Text}'");
                }
            }
        }
    }
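The custom-model path mentioned above can be sketched in the same way. This is a minimal example, assuming a model ID obtained from an earlier training run; the class and method names (CustomFormExample, AnalyzeCustomFormAsync) are illustrative and not part of the SDK.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;
using Azure.AI.FormRecognizer;
using Azure.AI.FormRecognizer.Models;

public static class CustomFormExample
{
    // modelId is assumed to be the ID returned when the custom model was trained.
    public static async Task AnalyzeCustomFormAsync(
        FormRecognizerClient client, string modelId, string filePath)
    {
        using var stream = new FileStream(filePath, FileMode.Open, FileAccess.Read);

        RecognizeCustomFormsOperation operation =
            await client.StartRecognizeCustomFormsAsync(modelId, stream);
        RecognizedFormCollection forms = (await operation.WaitForCompletionAsync()).Value;

        foreach (RecognizedForm form in forms)
        {
            Console.WriteLine($"Form type: {form.FormType}");
            foreach (FormField field in form.Fields.Values)
            {
                // ValueData is null when the service found no text for the field.
                string text = field.ValueData?.Text ?? "(no value)";
                Console.WriteLine($"  {field.Name}: '{text}' (confidence {field.Confidence:F2})");
            }
        }
    }
}
```

Printing the confidence next to each field makes it easier to decide which values are safe to accept automatically and which should be reviewed by a person.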
Handling Models and Training Data
- Form Recognizer can train custom models on your own forms. You upload sample documents to an Azure Blob Storage container and, for labeled training, tag the fields that the model should learn to recognize.
- Keep the training set clean and well organized: consistent document types and accurate labels improve the accuracy of the resulting model.
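    // Training is exposed by FormTrainingClient (namespace Azure.AI.FormRecognizer.Training),
    // not by FormRecognizerClient. _formTrainingClient is assumed to be a FormTrainingClient
    // field created with the same endpoint and credential as _formRecognizerClient.
    public async Task TrainModelAsync(string trainingDataUrl)
    {
        // trainingDataUrl is the SAS URL of the blob container that holds the training documents.
        var operation = await _formTrainingClient.StartTrainingAsync(
            new Uri(trainingDataUrl), useTrainingLabels: true);
        var customFormModel = await operation.WaitForCompletionAsync();
        Console.WriteLine($"Model ID: {customFormModel.Value.ModelId}");
    }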
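To judge whether the training data was clean enough, it can help to print what the service reports about the finished model. The sketch below assumes the model is a CustomFormModel as returned by FormTrainingClient, and the property names reflect my understanding of that type; verify them against the package version you have installed. TrainingReport is an illustrative name.

```csharp
using System;
using Azure.AI.FormRecognizer.Training;

public static class TrainingReport
{
    public static void Print(CustomFormModel model)
    {
        Console.WriteLine($"Model {model.ModelId}: {model.Status}");

        // Documents that failed to train usually point at problems in the data set.
        foreach (TrainingDocumentInfo doc in model.TrainingDocuments)
        {
            Console.WriteLine($"  {doc.Name}: {doc.Status}, {doc.Errors.Count} error(s)");
        }

        // Accuracy is reported per submodel and per field for labeled training.
        foreach (CustomFormSubmodel submodel in model.Submodels)
        {
            Console.WriteLine($"  Submodel '{submodel.FormType}', accuracy {submodel.Accuracy}");
            foreach (CustomFormModelField field in submodel.Fields.Values)
            {
                Console.WriteLine($"    {field.Name}: accuracy {field.Accuracy}");
            }
        }
    }
}
```

Fields with noticeably lower accuracy than the rest are good candidates for relabeling or for additional training samples.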
Handling Errors and Debugging
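- Calls to a live service can fail for reasons outside your code, such as an invalid key, throttling, or an unsupported file. Wrap them in try-catch blocks; the Azure SDK reports service errors as RequestFailedException, which carries the HTTP status and a service error code.
- The Azure SDK also emits diagnostic events for the requests it sends and the responses it receives. Enabling them helps you trace problems in the interaction with the API.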
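    try
    {
        await AnalyzeFormAsync("path/to/your/document.pdf");
    }
    catch (RequestFailedException ex)
    {
        // Service-side failure: Status is the HTTP status code, ErrorCode the service error code.
        Console.WriteLine($"Request failed ({ex.Status}, {ex.ErrorCode}): {ex.Message}");
    }
    catch (Exception ex)
    {
        // Anything else, such as a missing file or a network failure.
        Console.WriteLine($"An error occurred: {ex.Message}");
    }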
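The SDK's diagnostic events can be written to the console with AzureEventSourceListener from the Azure.Core package. The following is a minimal sketch; the wrapper class and method (DiagnosticsExample, RunWithSdkLoggingAsync) are illustrative names, and verbose logging is assumed to be acceptable only while debugging.

```csharp
using System;
using System.Diagnostics.Tracing;
using System.Threading.Tasks;
using Azure.Core.Diagnostics;

public static class DiagnosticsExample
{
    // Runs the given work with Azure SDK events written to the console.
    public static async Task RunWithSdkLoggingAsync(Func<Task> work)
    {
        // The listener forwards SDK events until it is disposed at the end of the method.
        using AzureEventSourceListener listener =
            AzureEventSourceListener.CreateConsoleLogger(EventLevel.Verbose);

        await work();
    }
}
```

A call such as RunWithSdkLoggingAsync(() => service.AnalyzeFormAsync("path/to/your/document.pdf")), where service is a FormRecognizerService instance, would log the traffic for that one operation and leave the rest of the application quiet.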
Optimizing Performance
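- Use the asynchronous methods, as shown in the examples, so that threads are not blocked while the service works. When you have many documents, process several of them concurrently rather than one at a time, while staying within your resource's rate limits.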
- Keep models updated with new data samples to improve accuracy and account for data drift over time.
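Concurrent processing with a cap on in-flight requests can be sketched with SemaphoreSlim and Task.WhenAll. ThrottledProcessor is an illustrative helper, not part of the SDK, and the default of four concurrent calls is an arbitrary assumption that should be tuned to your resource's limits.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ThrottledProcessor
{
    // Runs processAsync for every path, with at most maxConcurrency calls in flight.
    public static async Task ProcessAllAsync(
        IEnumerable<string> filePaths,
        Func<string, Task> processAsync,
        int maxConcurrency = 4) // Illustrative default, not a service recommendation.
    {
        if (maxConcurrency < 1)
        {
            throw new ArgumentOutOfRangeException(nameof(maxConcurrency));
        }

        using var gate = new SemaphoreSlim(maxConcurrency);

        async Task RunOneAsync(string path)
        {
            await gate.WaitAsync();
            try
            {
                await processAsync(path);
            }
            finally
            {
                gate.Release();
            }
        }

        // ToList() starts every task now; the semaphore limits how many proceed at once.
        List<Task> tasks = filePaths.Select(RunOneAsync).ToList();
        await Task.WhenAll(tasks);
    }
}
```

With the service class from earlier, this could be called as ThrottledProcessor.ProcessAllAsync(paths, service.AnalyzeFormAsync). Task.WhenAll surfaces the first exception when awaited, so per-document error handling belongs inside the delegate if one failure should not hide the others.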