Object recognition
This is a request for help with a shortcoming I would like to improve, rather than a report of a serious problem.
When cropping an image out of a larger scan, the initial crop rectangle covers 80% of the field of view (FOV) in each dimension and is centred on the page. If I understand correctly, this is done by the following section of src/app-window.vala:
    else if (crop_name == "custom")
    {
        var width = page.width;
        var height = page.height;
        var crop_width = (int) (width * 0.8 + 0.5);
        var crop_height = (int) (height * 0.8 + 0.5);
        page.set_custom_crop (crop_width, crop_height);
        page.move_crop ((width - crop_width) / 2, (height - crop_height) / 2);
    }
This means at least two clicks are needed to position the crop properly. Instead, I would like to guess the position of the physical object within the FOV, for example by quantifying the high-frequency fluctuations in the image. The real object is likely to show far more variation than the uniform white background of the scanner lid. So I would like to introduce something like the following (pseudocode):
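    else if (crop_name == "auto")
    {
        // Pseudocode: conv2D, square, histogram, find_first and find_last
        // are placeholder helpers, not existing functions.
        var dx = conv2D (image, sobel_x);               // horizontal gradient
        var dy = conv2D (image, sobel_y);               // vertical gradient
        var fluctuation = square (dx) + square (dy);    // element-wise squared gradient magnitude
        var is_object = fluctuation > threshold1;       // per-pixel mask
        var xpos_hist = histogram (is_object, 1);       // object pixels in each column
        var ypos_hist = histogram (is_object, 2);       // object pixels in each row
        int x1 = find_first (xpos_hist > threshold2);
        int y1 = find_first (ypos_hist > threshold2);
        int x2 = find_last (xpos_hist > threshold2);
        int y2 = find_last (ypos_hist > threshold2);
        var crop_width = x2 - x1 + 1;                   // x1 and x2 are both inside the crop
        var crop_height = y2 - y1 + 1;
        page.set_custom_crop (crop_width, crop_height);
        page.move_crop (x1, y1);
    }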
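To make the idea concrete, here is a minimal sketch of the same steps in plain Python (not Vala, and with no external library), so that the logic can be checked before porting. Everything in it is an assumption for illustration: the names sobel_energy and auto_crop, the image being a list of rows of grayscale values, and the two thresholds are hypothetical and not part of the existing code base.

```python
from typing import List, Optional, Tuple

Image = List[List[int]]  # assumed input: rows of grayscale values, 0..255


def sobel_energy(img: Image) -> List[List[int]]:
    """Squared Sobel gradient magnitude; the one-pixel border is left at 0."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Horizontal gradient: right column minus left column, weights 1-2-1.
            gx = (img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1]) \
               - (img[y - 1][x - 1] + 2 * img[y][x - 1] + img[y + 1][x - 1])
            # Vertical gradient: row below minus row above, weights 1-2-1.
            gy = (img[y + 1][x - 1] + 2 * img[y + 1][x] + img[y + 1][x + 1]) \
               - (img[y - 1][x - 1] + 2 * img[y - 1][x] + img[y - 1][x + 1])
            out[y][x] = gx * gx + gy * gy
    return out


def auto_crop(img: Image, threshold1: int,
              threshold2: int) -> Optional[Tuple[int, int, int, int]]:
    """Return (x, y, width, height) of the guessed object, or None if no
    column or row has more than threshold2 high-gradient pixels."""
    energy = sobel_energy(img)
    is_object = [[e > threshold1 for e in row] for row in energy]
    # Project the mask onto each axis: count object pixels per column and per row.
    col_counts = [sum(col) for col in zip(*is_object)]
    row_counts = [sum(row) for row in is_object]
    xs = [i for i, c in enumerate(col_counts) if c > threshold2]
    ys = [i for i, c in enumerate(row_counts) if c > threshold2]
    if not xs or not ys:
        return None
    return xs[0], ys[0], xs[-1] - xs[0] + 1, ys[-1] - ys[0] + 1


# Synthetic check: a 20x20 white page with a dark rectangle at x 5..12, y 6..14.
page = [[255] * 20 for _ in range(20)]
for y in range(6, 15):
    for x in range(5, 13):
        page[y][x] = 0
print(auto_crop(page, 1000, 0))  # (4, 5, 10, 11)
```

On this synthetic page the guessed box is one pixel larger than the rectangle on every side, because the 3x3 kernel responds on both sides of each edge. Real scans would also need the thresholds tuned against noise, dust and lid shadows, which this sketch does not attempt.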
I could not figure out which library I am supposed to use and link against for the linear algebra (here, the 2D convolution and the array operations). For a GNOME project, may I use OpenCV? Can somebody help me with this?
Thank you