mt.opencv.imgcrop

A module dealing with image croppings and image crops.

Croppings and crops are understood as follows. Cropping is the act of cutting off parts of an image to form a smaller image, possibly with a different resolution. Hence, a cropping is analogous to an image transformation. A crop is the result of cropping an image. Hence, a crop is like an image transform, that is, the output of a transformation.

Functions

  • weight2crop(): Estimates a crop that covers a minimum percentage of the total weight.

  • estimate_cropping(): Estimates a cropping to contain almost all the content of a mask crop.

mt.opencv.imgcrop.weight2crop(weight_image: ndarray, alpha: float = 0.98, thresh: float = 0.0, square: bool = True, padding: float = 0.0) → Rect

Estimates a crop that covers a minimum percentage of the total weight.

Parameters:
  • weight_image (numpy.ndarray) – a 2D weight image with shape (height, width) and every pixel has a non-negative weight

  • alpha (float) – fraction of the total weight to cover. It determines the level beta such that the total weight of the pixels whose value is greater than or equal to beta is greater than or equal to alpha times the total weight.

  • thresh (float) – threshold, below which the weight is set to zero

  • square (bool) – whether to return a square (True) or a general rectangle (False)

  • padding (float) – amount of padding, as a percentage of each dimension, added to make the returned rect larger than strictly necessary (for example, to give a food recognition model more context)

Returns:

a Rect such that all pixels whose values are at or above beta (see above) are included, and such that the total area including padding is as small as possible. If square is True, the returned rectangle is a square.

Return type:

mt.geo2d.rect.Rect
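
The level-set idea behind the alpha parameter can be sketched without the library. The block below is only an illustration under stated assumptions, not the implementation of weight2crop(): the function name level_set_bbox is hypothetical, the weight image is a plain nested list, the result is a (min_x, min_y, max_x, max_y) tuple rather than a Rect, and the square and padding options are left out.

```python
def level_set_bbox(weights, alpha=0.98, thresh=0.0):
    """Hypothetical helper: return (beta, (min_x, min_y, max_x, max_y))."""
    # Set every weight below the threshold to zero.
    w = [[v if v >= thresh else 0.0 for v in row] for row in weights]
    total = sum(sum(row) for row in w)
    if total <= 0:
        raise ValueError("weight image has no positive weight")
    # Visit pixel values from largest to smallest until the selected
    # pixels carry at least alpha of the total weight.
    acc = 0.0
    beta = 0.0
    for v in sorted((v for row in w for v in row), reverse=True):
        acc += v
        beta = v
        if acc >= alpha * total:
            break
    # Bound every pixel whose weight is at least beta.
    # Pixel (x, y) is assumed to cover [x, x+1) x [y, y+1).
    xs = [x for row in w for x, v in enumerate(row) if v >= beta]
    ys = [y for y, row in enumerate(w) for v in row if v >= beta]
    return beta, (min(xs), min(ys), max(xs) + 1, max(ys) + 1)


weights = [
    [0, 0, 0, 0],
    [0, 5, 4, 0],
    [0, 0, 1, 0],
]
beta, box = level_set_bbox(weights, alpha=0.85)
```

With these weights the total is 10, so alpha = 0.85 requires a covered weight of 8.5. The two largest pixels (5 and 4) are enough, which gives beta = 4 and a box around those two pixels.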

mt.opencv.imgcrop.estimate_cropping(mask_cropping: Cropping, mask_crop: ndarray, out_cropres: list, alpha: float = 0.98, thresh: float = 0.0, square: bool = True, no_subpixel: bool = True, try_to_fit: bool = True) → Cropping

Estimates a cropping to contain almost all the content of a mask crop.

The problem the function addresses is as follows. Suppose that on a mask image space there is a cropping and its corresponding mask crop. Mask values are non-negative, and anything outside the mask crop is treated as having a mask value of 0. The goal is to estimate another cropping on the mask image space such that, when the new cropping is applied, the total mask value in the new crop is not less than alpha times the total mask value on the image space.

The solution involves building a level-set function, finding the optimal level, then bounding on any pixel not lower than that level.
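
The acceptance criterion stated above can be written down directly. The following is a minimal sketch, assuming a mask stored as a nested list in image space and a candidate window given as integer pixel bounds (min_x, min_y, max_x, max_y); the name coverage is hypothetical and not part of the module.

```python
def coverage(mask, window):
    """Hypothetical helper: fraction of the total mask value inside window."""
    x0, y0, x1, y1 = window
    total = sum(sum(row) for row in mask)
    inside = sum(
        v
        for y, row in enumerate(mask)
        for x, v in enumerate(row)
        if x0 <= x < x1 and y0 <= y < y1
    )
    return inside / total


mask = [
    [0, 0, 0, 0],
    [0, 5, 4, 0],
    [0, 0, 1, 0],
]
alpha = 0.85
candidate = (1, 1, 3, 2)  # covers the pixels with values 5 and 4
accepted = coverage(mask, candidate) >= alpha
```

A window is acceptable for a given alpha when its coverage is at least alpha. Here the candidate holds 9 of the 10 units of mask value, so it passes for alpha = 0.85.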

Parameters:
  • mask_cropping (Cropping) – the original mask cropping

  • mask_crop (numpy.ndarray) – a rank-2 array of shape (H, W) (matching the cropres of mask_cropping) representing the corresponding mask crop

  • out_cropres (list) – pair [crop_width, crop_height] defining the cropres of the desired output cropping

  • alpha (float) – threshold to determine the level set beta such that the sum of mask values of selected pixels is not less than alpha times the total mask value. A pixel is selected if its mask value is not less than beta.

  • thresh (float) – threshold, below which the mask value is set to zero

  • square (bool) – whether the resultant crop window is a square (True) or a general rectangle (False)

  • no_subpixel (bool) – whether or not each of the output pixels must be at least as big as an input pixel

  • try_to_fit (bool) – whether or not to try to adjust the crop window so that it fits within the image resolution

Returns:

out_cropping – the output cropping, whose imgres matches the imgres of mask_cropping and whose cropres matches out_cropres. The crop window is either a square or a general rectangle, according to the argument square. It is not guaranteed that the crop contains only pixels inside the image whose resolution is the imgres of mask_cropping.

Return type:

Cropping

Classes

  • Cropping: An image cropping, the act of cutting an image to a crop window and resizing it.

class mt.opencv.imgcrop.Cropping(imgres: list, window: Rect | None = None, cropres: list = [1, 1], crop: Rect | None = None)

An image cropping, the act of cutting an image to a crop window and resizing it.

Parameters:
  • imgres (list) – pair of [width, height] of the source image

  • window (mt.geo2d.Rect, optional) – the rectangle on the source image defining where to cut/crop. If not given, it is set to be the rectangle capturing the whole image.

  • cropres (list) – pair of [width, height] defining the resolution of the crop after being extracted from the source image

  • crop (mt.geo2d.Rect, optional) – A different name for argument ‘window’. For backward compatibility only.

Inheritance

[Inheritance diagram: a single node, Cropping]

apply(in_image: ndarray, out_image: ndarray | None = None, inter_mode: str = 'bilinear', border_mode: str = 'replicate') → ndarray

Applies the cropping to an image and returns the crop.

Parameters:
  • in_image (numpy.ndarray) – input image from which the cropping takes place. It must have the same resolution as the imgres of the cropping.

  • out_image (numpy.ndarray, optional) – output image to be cropped and resized to. If provided, it must have the same resolution as the cropres of the cropping. Otherwise, one is generated with the same dtype and number of channels as the input image, and with the cropres of the cropping as its resolution.

  • inter_mode ({'nearest', 'bilinear'}) – interpolation mode. ‘nearest’ means nearest neighbour interpolation. ‘bilinear’ means bilinear interpolation.

  • border_mode ({'constant', 'replicate'}) – border filling mode. ‘constant’ means filling with a constant value of zero. ‘replicate’ means replicating the last pixels in each dimension.

Notes

Since we use OpenCV for warping, the maximum number of channels is 4.
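
To make the two border modes concrete, here is a small dependency-free sketch of nearest-neighbour cropping on a single-channel image. It is not how apply() is implemented (the method uses OpenCV warping); the function crop_nearest, the tuple window (min_x, min_y, max_x, max_y) and the pixel-centre convention are all assumptions made for illustration.

```python
import math


def crop_nearest(image, window, cropres, border_mode="replicate"):
    """Hypothetical helper: crop window from a 2D list, resize to [width, height]."""
    img_h, img_w = len(image), len(image[0])
    x0, y0, x1, y1 = window
    out_w, out_h = cropres

    def sample(x, y):
        inside = 0 <= x < img_w and 0 <= y < img_h
        if not inside and border_mode == "constant":
            return 0  # 'constant': fill with zero
        # 'replicate': clamp to the nearest valid pixel (no-op when inside).
        x = min(max(x, 0), img_w - 1)
        y = min(max(y, 0), img_h - 1)
        return image[y][x]

    out = []
    for j in range(out_h):
        # Source row that contains the centre of output row j.
        y = math.floor(y0 + (j + 0.5) * (y1 - y0) / out_h)
        row = []
        for i in range(out_w):
            # Source column that contains the centre of output column i.
            x = math.floor(x0 + (i + 0.5) * (x1 - x0) / out_w)
            row.append(sample(x, y))
        out.append(row)
    return out


image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
window = (1, 1, 4, 3)  # extends one pixel past the right edge of the image
replicated = crop_nearest(image, window, [3, 2], "replicate")
zero_filled = crop_nearest(image, window, [3, 2], "constant")
```

In this example the last output column falls outside the image, so it repeats the edge values 6 and 9 under 'replicate' and becomes 0 under 'constant'.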

get_img2crop_tfm() → Aff2d

Returns the 2D affine transformation mapping source pixels to crop pixels.

Returns:

tfm – output 2D transformation

Return type:

mt.geo2d.affine.Aff2d
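
For an axis-aligned crop window, the mapping from source pixels to crop pixels reduces to a per-axis scale and offset. The sketch below shows that arithmetic only; it does not use Aff2d, the names img2crop_params and map_point are hypothetical, and the coordinate convention (window corners map exactly to crop corners) is an assumption.

```python
def img2crop_params(window, cropres):
    """Hypothetical helper: return (sx, sy, tx, ty) with crop = s * img + t per axis."""
    x0, y0, x1, y1 = window
    crop_w, crop_h = cropres
    sx = crop_w / (x1 - x0)
    sy = crop_h / (y1 - y0)
    # Offsets chosen so that the window's top-left corner maps to (0, 0).
    return sx, sy, -x0 * sx, -y0 * sy


def map_point(params, x, y):
    """Apply the scale-and-offset mapping to one source point."""
    sx, sy, tx, ty = params
    return sx * x + tx, sy * y + ty


params = img2crop_params((10, 20, 110, 70), [50, 25])
top_left = map_point(params, 10, 20)
bottom_right = map_point(params, 110, 70)
```

Here a 100 by 50 window is reduced to a 50 by 25 crop, so both scale factors are 0.5 and the window corners land on (0, 0) and (50, 25).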

get_img2crop_tfm_tf()

Returns the 2D affine transformation TF tensor mapping source pixels to crop pixels.

Returns:

tfm – output 3x3 matrix representing the 2D affine transformation. The 3x3 matrix can be used in tensorflow_graphics.image.transformer.perspective_transform().

Return type:

tensorflow.Tensor

join(other)

Joins with another image cropping to form a composite image cropping.

Parameters:

other (Cropping) – another cropping whose imgres is the same as the current cropres

Returns:

the output composite image cropping, whose imgres is the same as that of self, and cropres is the same as that of other.

Return type:

Cropping
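
One way to picture join() is to express the other cropping's window in the coordinates of the original source image. The sketch below does this with plain tuples; join_windows is a hypothetical name, windows are (min_x, min_y, max_x, max_y), and the block is an assumption-based illustration rather than the method's code.

```python
def join_windows(self_window, self_cropres, other_window):
    """Hypothetical helper: window on the source image of the composite cropping."""
    x0, y0, x1, y1 = self_window
    crop_w, crop_h = self_cropres
    # Size of one pixel of self's crop, measured in source-image pixels.
    px = (x1 - x0) / crop_w
    py = (y1 - y0) / crop_h
    ox0, oy0, ox1, oy1 = other_window
    # Pull the other window back through self's mapping.
    return (x0 + ox0 * px, y0 + oy0 * py, x0 + ox1 * px, y0 + oy1 * py)


# self: window (10, 20, 110, 70) on the source image, cropres [50, 25]
# other: window (5, 5, 25, 15) on that 50 x 25 crop
composite_window = join_windows((10, 20, 110, 70), [50, 25], (5, 5, 25, 15))
```

The composite cropping would then use this window on the original image, together with the cropres of the other cropping.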

rebase(other)

Rebases the source image.

Suppose the current cropping maps window X of image A to image C and the other cropping maps window Y of image A to image B. The function returns a cropping that maps window Z of image B to image C, where Z is the transform of window X from image A to image B.

Parameters:

other (Cropping) – another cropping whose imgres is the same as the current imgres

Returns:

the output rebased cropping, whose imgres is the same as the cropres of the other cropping, and cropres is the same as that of the current cropping.

Return type:

Cropping
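
The window Z described above can be computed by pushing window X through the other cropping's mapping from image A to image B. The following is a minimal sketch with tuple windows (min_x, min_y, max_x, max_y); rebase_window is a hypothetical name and the block only illustrates the arithmetic, not the method itself.

```python
def rebase_window(self_window, other_window, other_cropres):
    """Hypothetical helper: express self's window (on image A) in image B coordinates."""
    ox0, oy0, ox1, oy1 = other_window
    b_w, b_h = other_cropres
    # Scale of the other cropping, which maps window Y of image A onto image B.
    sx = b_w / (ox1 - ox0)
    sy = b_h / (oy1 - oy0)
    x0, y0, x1, y1 = self_window
    return ((x0 - ox0) * sx, (y0 - oy0) * sy, (x1 - ox0) * sx, (y1 - oy0) * sy)


# other: window Y = (0, 0, 200, 100) of image A mapped to image B of size [100, 50]
# self: window X = (40, 20, 120, 60) of image A
z_window = rebase_window((40, 20, 120, 60), (0, 0, 200, 100), [100, 50])
```

The rebased cropping would pair this window on image B with the cropres of the current cropping.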